Signals and Threads - Building a UI Framework with Ty Overby
Episode Date: October 6, 2021

Ty Overby is a programmer in Jane Street's web platform group where he works on Bonsai, our OCaml library for building interactive browser-based UI. In this episode, Ty and Ron consider the functional approach to building user interfaces. They also discuss Ty's programming roots in Neopets, what development features they crave on the web, the unfairly maligned CSS, and why Excel is "arguably the greatest programming language ever developed." You can find the transcript for this episode on our website.

Some links to topics that came up in the discussion:
- Jane Street's Bonsai library
- The 3D design system OpenSCAD
- Matt Keeter's libfive design tools
- The Try .NET in-browser REPL
- Jane Street's Incr_dom library
- The Elm Architecture, a "pattern for architecting interactive programs"
- The React JavaScript library
- The Houdini proposal
- The Svelte UI toolkit
Transcript
Welcome to Signals and Threads,
in-depth conversations about every layer of the tech stack from Jane Street.
I'm Ron Minsky.
It's my pleasure to be talking today with Ty Overby.
Ty is a software engineer who's been at Jane Street since 2018,
and his work here has been focused on our tools for building web applications,
and specifically on a library he designed called Bonsai. We're going to talk a lot about the ideas behind Bonsai and about UI
development more generally. But first, Ty, can you tell us a bit more about your background
before Jane Street? Yeah, so I think my first experience with web development was probably
when I was in middle school writing pet pages for my Neopets. After that, I've been doing hobby programming
ever since I was a kid, went to the University of Washington, and worked on compilers for a while
at Microsoft. I decided at some point to move to New York, so Jane Street seemed like a great
place to continue my love for programming languages, and somehow managed to get on a
team doing some programming languages and web development
stuff, which I think is pretty exciting. So you got to combine two different loves.
Yeah. You talked a little bit about your hobby programming. I know that you're someone who
really spends time and really enjoys digging into all sorts of things in your spare time.
Can you tell us a little bit more about what kind of things you've done on the side?
Yeah. So I guess a running theme of mine is that I wind up taking the hammer
that is compiler development
and bashing everything that I can get my hands on with it.
I think my biggest, longest running project
is probably a computer-aided design library suite of software
where in order to describe the shapes for your object,
you write programs, little functions
that go from some point in space to the
distance to the outline of your shape. Then there's all sorts of fun combinators that you can write on
top of this and all sorts of fun optimizations that you can do if you somewhat limit the
capabilities of the person using it, taking compiler optimizations to the computer-aided
design workflow.
So you're diving right into the abstractions in the math, but to step back for a second,
the thing you're talking about, as I understand it, is what's called programmatic CAD,
which is instead of having something that's the moral descendant of MacPaint or something,
where you go in and interactively construct some 3D object that's going to then be printed in a
3D printer or constructed in some other way, you write a program to do that. And there's actually
lots of stuff out there that lets you do this already. One of them that I've played around
with myself and with my kids is called OpenSCAD. Yeah, so OpenSCAD is programmatic CAD. I believe
it operates on a boundary representation. So this is very similar to what you would get in CAD software or Blender
or any of the other 3D modeling software, where the fundamental primitive is a point
or a face, whereas the style that I'm particularly interested in as a functional programmer is called
functional representation or F-rep. And these two different styles of representing shapes lead to some really interesting trade-offs
Some things that are really easy in B-reps, or boundary representations, are very hard in F-reps,
and vice versa. Right. And just to see if I have the description right: an F-rep is something where,
instead of describing the boundary of some surface explicitly through a collection of polygons or whatever,
you write down a function, and then the boundary is something like the zeros of that function.
So implicitly, you're describing the boundary by just writing down what the function is.
They are also called implicit surfaces.
Why is it better to use implicit surfaces versus a boundary representation?
You said there are some things that are easier and some things that are harder. Can you give me a sense, as someone who might use this,
why I might prefer the implicit representation?
Oh, I think a lot of the combinators are trivial to express. So things like taking two shapes and
finding their union: well, the two shapes are two functions, so you apply them both at a point and
combine the results, and the function that combines these now describes the shape that is the
union of those two shapes. These functions return numbers that represent the distance to the edge:
zero is on the edge, negative numbers are inside, and positive numbers are outside. With that
convention, union would be min, intersection would be max, and neg, or taking a shape and making
the outside inside and the inside outside, is just literally negation.
And you can use negation and intersection to get cuts.
And also something that you can't do in boundary representation stuff is just write some code.
And I really enjoy just being able to think of a function and see what kind of shape it
would create.
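To make these combinators concrete, here is a minimal OCaml sketch, assuming the convention just stated (zero on the edge, negative inside, positive outside). The names are illustrative and not from any particular library:

```ocaml
(* A shape is a signed-distance function: negative inside, zero on the
   boundary, positive outside. *)
type shape = float * float -> float

let circle ~radius : shape =
 fun (x, y) -> sqrt ((x *. x) +. (y *. y)) -. radius

(* With the negative-inside convention, a point is in the union if it is
   inside either shape, so we keep the smaller (more negative) distance;
   intersection keeps the larger. *)
let union (a : shape) (b : shape) : shape = fun p -> Float.min (a p) (b p)
let intersect (a : shape) (b : shape) : shape = fun p -> Float.max (a p) (b p)
let invert (a : shape) : shape = fun p -> -.(a p)
let cut (a : shape) (b : shape) : shape = intersect a (invert b)

let () =
  let ring = cut (circle ~radius:2.0) (circle ~radius:1.0) in
  (* The origin is inside the inner circle, so it is outside the ring. *)
  assert (ring (0.0, 0.0) > 0.0);
  (* A point at distance 1.5 from the origin is inside the ring. *)
  assert (ring (1.5, 0.0) < 0.0)
```

Under this convention a cut is just intersection with the complement, which is why negation plus intersection suffices.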
Right.
Although, I mean, you can write code. OpenSCAD lets you write code that generates
boundary representations and manipulates those boundary representations. So can't you
do it programmatically in both styles?
Probably. I think some of the shapes that I'm interested in are things like the interference
pattern between two sine waves, each of them describing distances to the outside of a shape
and cutting that off via a
circle. And the cutting off via a circle, OpenSCAD can do just fine. But describing a shape implicitly
is something that implicit surfaces do pretty well. And the type of programming that you do
is very different. And I can immediately see how your functional programming background
leaks into the way you talk about CAD, right?
I think people who are doing CAD don't normally talk about combinators.
They might talk about the operations you use for combining shapes.
And I guess those two are one and the same in the way that you're describing it.
Yeah.
As I understand it, there are other things that are supposed to be good about implicit surface style representations. Like I've used OpenSCAD a lot. And one thing that's really annoying about it is you have to decide early on when you're
constructing the boundaries, how detailed the boundary representation is going to be.
So there's essentially almost a tessellation of the surface or an approximation of the
surface by a bunch of little tiny polygons.
And then when you do things like merge things together, you can get all sorts of weird artifacts
that come from exactly the kind of beats in the
frequencies of the different polygonalizations of the two shapes and get all sorts of weird
things that happen on the boundaries. And I think the implicit surface approach gives you
something where while you're doing all of your transformations, you have everything in some sense
at full resolution. And then finally, when you render it, you get to decide how you want to
do the final display and how detailed that needs to be. Yeah, that's exactly right. Sadly, one of the downsides is that doing that
final transformation from F-rep to B-rep, which is useful if you want to 3D print something or
pull it into a game engine. That's really hard. As far as I can tell, no one has done it perfectly
yet. There's an amazing project by Matt Keeter called libfive,
which I've built a ton of stuff on top of,
and I've loved using that, and it's very good,
but it still has issues with weird edge cases.
In this case, literally edge cases.
You can imagine interesting things happening to a shape
as points go to infinity or negative infinity.
And unless you're very careful, you can easily create topologies that are hard or impossible
to render into a regular 3D file.
Right. And I guess if you have points that go off to infinity,
it's probably hard to find a 3D printer big enough to actually render the resulting surface.
It's true.
Okay. So that's a fun kind of hobby.
You said you did a bunch of web development
on the side as part of your personal work.
Have you done web development
in a professional context as well before your work here?
Yeah, so I helped out on a project at Microsoft
building an in-browser REPL.
I don't know if any of my code is still alive in there,
but I think that's it, actually.
I think that was the only brush that I had with web dev professionally before Jane Street.
Got it. And then somehow, despite the lack of any professional background, nonetheless,
you ended up building the library that is now Jane Street's default library for building user
interfaces, which is called Bonsai. So let's talk about that a little bit. So when you arrived here,
we already had some foundation for doing web development. There was a library called
Incr_dom, and this was based on a compiler component we still use called js_of_ocaml. So
js_of_ocaml is a backend that takes OCaml and converts it into JavaScript,
doing actually a surprisingly good job of respecting OCaml semantics. And we had successfully built a number of user
interfaces using Incr_dom and js_of_ocaml, and also run into some trouble with it. Maybe you can tell
us a little bit about what Incr_dom looked like to you when you arrived and explain how it works
for someone who's never touched it before. For people that are familiar with the Elm architecture,
Incr_dom would seem quite natural.
Let's back up and talk about the Elm architecture for a little bit. Elm is a programming language
built entirely for developing user interfaces in the browser. Certainly its standard library
is very focused on this. And it also comes with a design pattern. And that design pattern is separating the logic of your application into
a view function that takes inputs and the model for an application and produces a pure functional
representation of its view. And then also an apply-action function, which can take an existing model for the application and
an action that is triggered either by the user interacting with the page in some way
or by interactions with a WebSocket connection or other I.O.
And it'll take this action and the current model and produce a new model.
And then the render function gets called
again with the new model. And hopefully the changes that you want to see represented on the
screen make their way there. Great. And so this is in some sense just another spin on the classic
model-view-controller pattern: the model is your representation of data, the view is what you're going
to display, and the controller is somehow how you interact with the outside world and do transformations
to your model, right?
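To make the pattern concrete, here is a minimal OCaml sketch of the Elm-style loop just described. The types and names are illustrative rather than any real Elm or Incr_dom API, and a string stands in for the virtual-DOM node:

```ocaml
(* The model is plain immutable data. *)
type model = { count : int }

(* Actions describe every way the model is allowed to change. *)
type action = Increment | Decrement

(* The pure state-machine step: old model + action -> new model. *)
let apply_action (model : model) (action : action) : model =
  match action with
  | Increment -> { count = model.count + 1 }
  | Decrement -> { count = model.count - 1 }

(* The view is pure data computed from the model; a string stands in
   for a virtual-DOM node here. *)
let view (model : model) : string =
  Printf.sprintf "<div>count = %d</div>" model.count

let () =
  let m = apply_action { count = 0 } Increment in
  assert (view m = "<div>count = 1</div>")
```

In a real framework the runtime owns the loop: it collects actions raised by the view, runs `apply_action`, and re-renders with the new model.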
And I guess this whole thing will sound very familiar to anyone who's used React, where
React is also based on a notion of a pure functional representation of the DOM.
You have some just pure data structure that represents the DOM you wish you had, and it
gets slammed into the actual messy, complicated tree of objects that is the
actual DOM in the browser in a hopefully efficient way. Can you say maybe a word about how Elm
architecture differs from what a React programmer might be used to? On the surface, they appear to
be very similar. The one thing that React users typically reach for is a state management library.
And this is changing over time. I think it's a bit less
common now that React has adopted some of this into the library itself. But there were some very
popular libraries like Redux or MobX that did the pure functional model updating that Elm has
kind of built into its framework. Prior to this, the way that you did state in React was by putting
state in your components. And this is handy. It's private. You don't need the rest of your
application to know about this state. But it also has some problems, like that if a component is
removed from the component tree and then added back in again, the state is gone. So it's a bit
more like having a mutable variable inside of your
component tree. And so people reached for state management solutions that provided the model to
action to model transformation that Elm has. And just to maybe say it in a slightly different way,
the Elm approach is to some degree, like a good first order approximation is your UI is a function
and the key function is a function from model to view. And it really is just a pure function where the model is just simple data and
the view is just simple data and in fact, immutable data at that. And then there's this extra action
step where inside of the view, there's some essentially description of how actions can be
used to update the model, say when you click a button. And that leads to an event which asynchronously will later then cause the model to be updated.
So you don't get this whole encapsulation of hiding state inside of components.
Instead, you have this pure and in some sense more totalistic, like, no,
just the whole UI is a function.
And then we compose functions in the ordinary way that we compose functions
in the context of functional programming.
Yeah, exactly.
And I think a major selling point for libraries like this is your application as a function,
where you take the state and you transform it into a view.
But really, I think that that's the least interesting part.
I think the way that you get applications that people want to use is by having really
well-defined and edge case-free interactions.
And that means transitioning from one model to the next in a way that makes sense to your users.
And I think that modeling things as a state machine like Elm does and like some of these
React state managers is a great way to do that.
So that's like a new piece of jargon, a state machine.
When you say it's modeled as a state machine, what does that concretely mean?
Yeah, so I guess the abstract notion of a state machine is you have some state,
and then you have actions that express a desire to change the state,
and another function that takes an action and state and produces a new state.
This separation allows programs to more accurately react when interesting things happen, as opposed to direct
mutation. So you might notice a few actions coming in all together, and you could debounce them to
collapse multiple actions into just a single one. You could imagine a user accidentally hitting a
button twice or typing something a bit too fast, and you might want to put some guardrails on that
and make sure that they aren't doing anything accidentally.
This is the type of thing where if in an action handler
you were directly mutating some state,
it might be a bit harder to do this.
But if you're representing everything as transitions between states
that are described by actions that are raised by events,
then this kind of intentional transition becomes a lot more
possible. Which is to say, you might think about something like this as being primarily about two
different types of data, the model that represents the state of your application and the view,
which is how you present it to the user. But really there's three, there's the model and
there's the view. And there's also the action, which is the thing that represents transitions,
ways in which you change your model.
And you have some kind of explicit way of talking about the way in which transitions cause the state
to update. And it's that structure that you're referring to as a state machine.
Yeah, exactly.
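For instance, a guardrail like the one described, dropping a too-fast repeated click, falls out naturally when transitions are explicit. A small illustrative sketch (not any real framework's API), with timestamps passed in explicitly so the step function stays pure:

```ocaml
(* Because transitions are explicit actions rather than direct mutation,
   the step function can add guardrails, e.g. ignore a repeated button
   press that arrives too soon after the last one. *)
type model = { submissions : int; last_submit : float }

type action = Submit of float (* time of the click, in seconds *)

let min_gap_s = 0.5

let apply_action (m : model) (Submit now) =
  if now -. m.last_submit < min_gap_s then m (* debounce: drop the action *)
  else { submissions = m.submissions + 1; last_submit = now }

let () =
  let m0 = { submissions = 0; last_submit = neg_infinity } in
  let m1 = apply_action m0 (Submit 10.0) in
  let m2 = apply_action m1 (Submit 10.1) in (* too soon: ignored *)
  let m3 = apply_action m2 (Submit 11.0) in
  assert (m3.submissions = 2)
```

If the handler mutated state directly, there would be no single place to put this kind of intentional check.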
Great. So how does this Elm model that you described, how does that relate to what Incr_dom
was like? Incr_dom took the pattern that Elm popularized and left it relatively unchanged, with the exception
of a single addition, which was combining it with a library that we have developed here internally
called Incremental. And the Incremental library is effectively a smart memoization technique, where if you write a function using the incremental
library and you run it multiple times where the inputs are changing only slightly between runs,
then you could expect some better performance if the output that you're producing is changing
slightly in response to inputs changing slightly. Right. and this goes back to a general fact I've long noted about UI frameworks,
which is that hidden inside of every UI framework is some kind of incrementalization framework as well.
Because you basically need this incrementalization for performance reasons everywhere.
UIs in general, what are they doing?
They are computing something to display to the user and those things need to
change quickly. And one of the things that can help them change quickly is they don't change
all at once. Little bits of them change. You go and click somewhere and some small part of what
you're seeing changes, not everything in the entire view being transformed all at once. And
so you need some way of computationally taking advantage of this. Maybe we should mostly leave
off those gory details here
as well. In some sense, the difference between Elm and Incr_dom is that Elm has a way of doing
incrementalization and Incr_dom has Incremental, which is a more powerful way of incrementalizing,
meaning you can express more complicated algorithms incrementally. But I think that's
kind of mostly not the thing that's interesting about the gap between Incr_dom and Bonsai.
So maybe we shouldn't
talk too much about Incremental, but I'm curious what you felt was problematic about Incr_dom.
The story you've told so far is Elm is this nice way of describing things. And then Incr_dom takes
the Elm model and adds this more powerful incrementalization framework. That sounds great.
What doesn't work that well about the model? One of the things that I think Incr_dom does
differently than your average React application is that it also incrementally computes the state machine transition
functions, these apply-action functions. And this is really nice. It means that you can thread values
throughout your application and make use of them in the state machine transition functions,
just as you would threading
values through your view function.
But it does mean that combining components becomes a lot harder, because you've not only
got to take the views from multiple components and stitch them together into one parent view,
but also you need to compose the apply actions for child components into this super
component. And the way that this is done with Incremental is very boilerplate-heavy. The
downside of having this library that expresses a lot more of the power of incremental computing
means that these compositions were quite verbose. So this whole point about the action application function itself being computed incrementally,
on the face of it seems very abstract.
Can you give an example of where that gives you something useful?
We have this library that implements a UI component for very large tables.
And one of the things about this table that's quite unique is that it can sort and filter
and display an absolutely enormous
amount of data, like hundreds of thousands of rows. And really a lot of the magic there is in
the view where we only actually show a very small subset to the DOM. But during that sorting and
filtering, we produce a lot of intermediate data structures that could be very useful in downstream components.
You might imagine that the sorted and filtered table might expose internal structures that allow
you to very efficiently implement row selection, where you want to be able to navigate through the
table using a keyboard. In a naive implementation, they might have to sort and filter all of that
data again.
But because we've done it incrementally, and because the apply action function
is also computed incrementally, we can share that data a lot easier.
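As a plain-OCaml stand-in for that kind of sharing (with Incremental itself elided, since its recompute-on-change machinery is the real point), one can imagine the expensive sorted structure computed once and reused by both the view and the selection transition. All names here are illustrative:

```ocaml
(* The expensive sorted view of the data is computed once and shared by
   both the rendering code and the keyboard-navigation transition,
   instead of each recomputing it. Incremental does this, plus cheap
   recomputation when the inputs change slightly. *)
let rows = [ 3; 1; 2 ]

(* Computed lazily, and shared by everything below. *)
let sorted = lazy (List.sort compare rows)

let view () =
  String.concat "," (List.map string_of_int (Lazy.force sorted))

(* Move a selection cursor through the *sorted* order without re-sorting. *)
let select_next ~selected =
  let s = Lazy.force sorted in
  match List.find_opt (fun x -> x > selected) s with
  | Some x -> x
  | None -> selected

let () =
  assert (view () = "1,2,3");
  assert (select_next ~selected:1 = 2)
```

The naive alternative, where the selection logic sorts the rows again, is exactly the duplicated work being avoided.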
It sounds like the problem you see that comes out of all of this is that it makes the system,
in some sense, less composable. It's harder, or at least requires more boilerplate,
to take little pieces of a UI and combine them together to make larger
UIs.
Do you think that problem is worse for Incr_dom than it is for Elm?
Or do you see the same problem in both approaches?
I see the same problem in both approaches, but I think that the boilerplate that you
get from Elm is quite minimal.
And for Incr_dom, it was absolutely not.
I believe that when I made the simplest possible combinator for two
Incr_dom components, which was just combining their models, or rather taking a supermodel that splits
into submodels, an apply action that would dispatch to one or the other, and a view that
combined both subviews. This was something like 50 to 70 lines of code. And what that meant was
that people were making monoliths where they didn't want to break
their code up into multiple pieces because they knew that they would have to do this
recombining at some level.
And that recombining was error prone and it was tedious.
Maybe in the opposite order, it was tedious and therefore error prone.
Great.
So this is a problem that you
wanted to solve, right? I think it seemed clear to you that it's not great to not give people a way
of building little reusable components. And if everyone's building monolithic things, they're
not going to throw off small, useful, reusable bits. And you built this library called Bonsai.
How does Bonsai help? Bonsai started out basically as a single type definition that encompassed what I saw as the right way to build an Incr_dom component.
This type signature was almost a reification of a programming pattern: a function that can incrementally compute a view, arbitrary extra data that might be useful
to downstream components, like we talked about earlier,
and an apply action.
And this function is given a model to act on
and a function for raising events
so that it can perform the state machine transition.
And so Bonsai really started out as that function type.
So just a bundle, you just take together the function for computing the view from the model,
the function for updating the action, the way of throwing actions out into the world,
just take all of those things and pack them together. And that's what a Bonsai component was.
Exactly. Yeah. And I was able to write some pretty nice combinators on top of this. So things like
given two components, give me a new one that combines their models, combines their apply
actions in the trivial manner. And then you could specify how you wanted the views combined as well.
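A rough OCaml sketch of that bundled type, and of the kind of trivial pairing combinator just described. This is a simplification for illustration; the real Bonsai and Incr_dom types are more elaborate (and incremental):

```ocaml
(* A component pairs a result-producing function with a state-machine
   step. [compute] is handed the current model and a way to raise
   actions; [apply_action] is the transition. *)
type ('model, 'action, 'result) component =
  { compute : 'model -> inject:('action -> unit) -> 'result
  ; apply_action : 'model -> 'action -> 'model
  }

(* Combine two components by pairing their models, summing their
   actions, and pairing their results. *)
let both (a : ('ma, 'aa, 'ra) component) (b : ('mb, 'ab, 'rb) component) :
    ('ma * 'mb, ('aa, 'ab) Either.t, 'ra * 'rb) component =
  { compute =
      (fun (ma, mb) ~inject ->
        ( a.compute ma ~inject:(fun act -> inject (Either.Left act))
        , b.compute mb ~inject:(fun act -> inject (Either.Right act)) ))
  ; apply_action =
      (fun (ma, mb) -> function
        | Either.Left act -> (a.apply_action ma act, mb)
        | Either.Right act -> (ma, b.apply_action mb act))
  }

let () =
  let counter =
    { compute = (fun m ~inject:_ -> string_of_int m)
    ; apply_action = (fun m () -> m + 1)
    }
  in
  let pair = both counter counter in
  (* An action routed to the left component only updates the left model. *)
  assert (pair.apply_action (0, 0) (Either.Left ()) = (1, 0))
```

Note how the combinator's code never inspects either child's model or action; it only routes them, which is the composition story at work.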
But fairly early on, I recognized that the fact that we called
them UI components at all was like a bit of a misnomer. Really, they're components producing some value.
And the fact that they happened to be views most of the time was a bit of an accident, really.
So among the first things that I did was remove the hard-coded view from that type.
And so now instead of a UI component that produces a view and maybe some extra data,
it was just, here is a component, not necessarily a UI component, that produced some result incrementally.
And frequently that result type would be a view, but it doesn't have to be.
I think more than half of the components
that I write for a web application won't be manipulating views at all. They will be state
machines that produce some results that I think might be interesting. And maybe they get transformed
into a view later on the line, or maybe they get passed in as inputs to other components.
Right. And this is a real break with the style of components that you see in systems like React, right? Every React component, by its nature, is producing some chunk of DOM. And here, Bonsai is a kind of framework that represents a component and its compositions in the form of a directed acyclic graph rather than as a tree.
And this is the thing that allows you to get components that produce values to thread them in to inputs for other components.
Whereas in a React-style UI, the value that you're computing incrementally must be a view.
So if you want to communicate with other components, you're almost forced to walk up the tree,
pass something on to its parent, and then that parent is responsible for shuffling down whatever
value you've computed. So that means that in React, when you break down your components,
you essentially
have to do it by slicing it along the structure of the generated DOM. That's right. The structure
of these components wind up being very closely related to the structure of the view that you
want to compute in the end. There are workarounds for this. One of the ones that I'm thinking of
now is that a parent component might produce a function that gets passed on to its children, and that function produces some more view.
And this is a good way of kind of doing a bit of a decoupling, but, well, it doesn't really match my aesthetics for how I want to be programming.
The tree style of building components in React also doesn't let you move values from one child's component of a node to another sibling.
What's the case where you'd want to do that?
One motivating example might be a text box and a button.
And you want the text box to have some text in it before the button is clickable. Now, in a tree-motivated UI component, you might have a component for your
text box, a component for your button, a callback that you pass to the text box to be notified when
there's text in it, and then a value passed into the button that says if it's disabled or not.
Whereas with a graph-oriented structure, you could have the value of the text for the text box flow into the button,
and the button could check to see if its invariants were upheld.
And the button doesn't need to be a child of the text box.
It just needs to depend on the value that it has produced.
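A schematic of that graph-shaped data flow, using plain functions in place of real Bonsai computations; the names and string views are purely illustrative:

```ocaml
(* The text box produces both its view and a value; the button is a
   sibling, not a child, and simply consumes that value. *)
let textbox (text : string) : string * string =
  (* returns (view, produced value) *)
  (Printf.sprintf "<input value=%S>" text, text)

let button ~(text_so_far : string) : string =
  let disabled = String.length text_so_far = 0 in
  Printf.sprintf "<button%s>Send</button>"
    (if disabled then " disabled" else "")

let page (text : string) : string =
  let textbox_view, value = textbox text in
  (* the value flows sideways into a sibling component *)
  textbox_view ^ button ~text_so_far:value

let () =
  assert (page "" = "<input value=\"\"><button disabled>Send</button>");
  assert (page "hi" = "<input value=\"hi\"><button>Send</button>")
```

In the tree-shaped alternative, the same wiring would go through a callback passed down to the text box and a flag passed down to the button.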
So one thing that's changed from the original view is that the components aren't specifically view- or virtual-DOM-producing components. They're more general components.
How else does a Bonsai type differ from what you might find in some other UI component framework?
Well, when you're comparing it to Inker DOM or Elm, one of the major differences is that
components don't actually expose what model they have, or really
even if they have a model, and they don't expose what their action type is either. So really,
a Bonsai.t, the primary type in Bonsai, which is called a computation, is only parameterized
over what value it produces. And any state machine and transitions that it has are local to it.
Part of what's going on here is you're basically giving a good way of compartmentalizing
the model and the action. So essentially, there's a part of this which is like a function,
which takes some input and produces some output, and part of this which is a state machine. And
it gives you a way of composing together functions in a way that lets you combine state machines as well. Is that a fair
description? Yeah, depending on who I'm talking to, I'll describe Bonsai as a framework for building
incremental state machines. So given that description, how general do you think Bonsai is?
Is it useful outside of building web UIs?
Is it useful outside of UI contexts entirely?
I want to say yes, but I've been looking for an alternative example for a very long time and haven't been able to really come up with anything.
Is that true for both the question of user interfaces that are not web user interfaces?
Oh, no.
I mean, I think Bonsai is broadly applicable to user interfaces, as long as you have some
abstraction that exposes the desired view for your UI as a pure value.
That is one of the downsides of Bonsai, is that effectively everything inside of it has
to be pure.
It becomes very hard to reason about when you have mutable values. So it might be hard to build a Bonsai UI on top of the platform-du-jour UI component library,
which is primarily mutation-based.
But it should be possible if someone wrote a virtual DOM style abstraction on top of it.
Right.
And this restriction, I guess, is essentially the same restriction that systems like React
and Elm have. And Elm, I guess, at the language level enforces purity,
and React, being in JavaScript, doesn't. And similarly with Bonsai, because it's in OCaml,
which is a language that supports mutation without restricting it at the level of types,
it's the case that this is just a mistake you can make. If you use Bonsai and you use side
effects in the middle, it's going to be confusing, and that's kind of that.
So early on, you said that a repeated pattern in your way of thinking about the work you do is ways
to turn the problems you solve into compiler problems, into programming language problems.
And we haven't really talked about how that affects Bonsai. What is the way in which the
Bonsai work also looks like a compiler project? I've been describing Bonsai as an incrementalized directed acyclic graph.
And one of the invariants that we enforce about this directed acyclic graph at the type
level is that it can't be dynamic.
That is to say, once you've constructed the super component that represents your entire
application, all
possible combinations of components are known. And this lets us effectively do a number of
compiler optimizations. If you look at this DAG as a program, instead of as a forest of nodes
communicating with one another, then you can employ some fairly rudimentary compiler optimizations
like constant folding or collapsing multiple nodes into one
if you detect that one of them is going to run really fast
and you would rather not have the overhead of incrementalization.
In some cases, we can even transform the algorithms
that are being used on a node by node basis.
So we talked earlier a bit about the library incremental allows you to express a lot of
powerful algorithms incrementally. And one of the places where it really shines is when it deals
with map-like data structures. And we are able to pick and choose from a number of these different algorithms
at the point where the Bonsai application is first evaluated to pick the one that is the most optimal
for that particular node. Right. And you're talking about this as being compiler optimizations,
but it's not like you're hacking the OCaml compiler to do something. Where in the process
of doing this does the compilation-like phase fit in?
Yeah, so it happens exactly once, and that is when you start the Bonsai application. You give
a Bonsai.t to the functional equivalent of a main method, and it'll take that, it'll crunch it,
it'll optimize it, and then it'll start running. After it starts running, we don't touch it
anymore. But at that point where the user hands off
the representation of their application,
we're able to fully traverse the entire component graph
and see where we want to make changes.
Right, so when someone is building up a Bonsai computation,
the thing they're immediately building
is something that's closer to the abstract syntax tree
of an expression that represents that computation.
And then you're running a compilation step to convert that into the real live running
computation.
Exactly.
That's why we call the thing computation.
It is an unevaluated expression, and that's what they give to us.
And we can look through the entire thing and then start running it.
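To make the "unevaluated expression" idea concrete, here is a toy sketch of a computation AST with a constant-folding pass, in the spirit of the optimization described above. This is not Bonsai's actual representation, just a minimal illustration of folding the static parts of a graph before the app starts running:

```ocaml
(* A toy computation AST, not Bonsai's real representation: a
   computation is either a constant, a named input, or the sum of
   two sub-computations. *)
type t =
  | Const of int
  | Input of string
  | Add of t * t

(* One rudimentary "compiler optimization": fold additions whose
   children are both constants, bottom-up. *)
let rec constant_fold (c : t) : t =
  match c with
  | Const _ | Input _ -> c
  | Add (a, b) ->
    (match constant_fold a, constant_fold b with
     | Const x, Const y -> Const (x + y)
     | a', b' -> Add (a', b'))

(* Add (Add (Const 1, Const 2), Input "clicks") collapses to
   Add (Const 3, Input "clicks"): the static part of the graph is
   evaluated once, before the application starts running. *)
```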
Right.
So that's a great approach, but it also limits you in some ways.
Yeah. The way I think of an application written in Bonsai is really an application written in
three different programming languages. The first one is a meta-language. This language happens to
be OCaml, and it manipulates computation.t and composes the structure of the application.
But once this metaprogramming is done, it's done. You don't get
to add new possibilities for components at runtime. The second language is Bonsai itself,
which when you look at code written in Bonsai, you can identify things that really look like
programming language constructs. So we have our own if, we have our own way to do recursion,
we have functions and ways to apply values to these functions.
The third programming language is the one that you alluded to, which was that the actual
business logic, the code that's running inside of these abstract notions of components is
again, OCaml.
At this level, you cannot add to the component graph.
It's restricted to performing calculations on values that are being passed through it
and applying actions to transition state machines.
But yeah, after stage one, you're kind of set in stone.
And maybe I should go back a little bit and say that when I say that the graph itself
is static after this first stage is finished, I don't mean that the application can't change
itself dynamically. So in Gmail, for example, you've got your inbox view and your compose view
and your reading someone else's email view. And in Bonsai, you can absolutely switch between all
of these at runtime. But at compile time, well, what I'm calling
compile time, it's known statically that there are those many views possible. And it can traverse
all of those separate subgraphs, even though there isn't any data actually flowing through it,
even though those components themselves aren't active right then and there.
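As a rough illustration of this kind of static branching: Bonsai's API has changed over time, so the names below follow the open-source `let%sub`/`match%sub` style and are approximate, and the three view components are assumed to be defined elsewhere.

```ocaml
(* Approximate sketch in the style of open-source Bonsai; names and
   signatures may not match the current API exactly. *)
open! Core
open Bonsai.Let_syntax

type view = Inbox | Compose | Reading

(* All three branches are part of the static component graph that the
   optimization pass can traverse, even though only one is active at
   runtime. [inbox], [compose], and [reading] are assumed components
   of type [Vdom.Node.t Bonsai.Computation.t]. *)
let app (current : view Bonsai.Value.t) : Vdom.Node.t Bonsai.Computation.t =
  match%sub current with
  | Inbox -> inbox
  | Compose -> compose
  | Reading -> reading
```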
And this is a kind of classic trade-off here where you're
taking away some real freedom from someone who is constructing the UI. There's some kind of
highly dynamic things that are maybe impossible or at least awkward to express in the context of
Bonsai. But in exchange, what you have is the ability to do more things like these kind of
optimizations that you're talking about within the framework. So in some ways, you take away power from the user of a
library, and you give power to the implementer of the library to provide better services to the user.
Exactly. Yeah. And I would say that the services that we provide go deeper than just compiler
optimizations. I've been saying that I will build a debugger for a very long time, and I haven't yet, but
it would be possible to build a very useful debugger that, ahead of time, knows the
exact structure of your UI and allows you to navigate the DAG and poke values and see
what the states of various components are in a way that I think
would be a lot harder if we didn't have this upfront solidified representation of the application.
So one of the things that strikes me about the way that you describe all of this, it's
in some sense quite abstract, right?
There's all these fairly fancy sounding ideas about how we build computations and compose
them together and how we're going to have our own language inside of the language and
our own version of if and our own version of recursion.
It kind of sounds like a lot.
I wonder, how does that affect the ability to construct a tool that you can give to people
and they can easily use?
What's the learning curve like with Bonsai and how hard has it been for people and teams to pick it up and use it? Yeah. So I think it's getting a lot better.
It's still not great. I learned a whole lot trying to teach very early versions of Bonsai
to people. And in fact, I think we ran about 50 people through a class that included Bonsai
when it was still very young.
And their frustrated expressions still motivate me today. I think that one of the things that
we've been leaning on quite heavily is the ability to extend OCaml's syntax itself to make
expressions in Bonsai look a lot like expressions in OCaml. So here I'm talking about the PPX system, which lets me
write little extensions to OCaml's syntax like let for binding variables, or if, or match for
evaluating different possibilities of data that's flowing through the Bonsai graph. So in a way,
the structure of the Bonsai code looks an awful lot like OCaml, but still has all of the
restrictions that we were talking about earlier. So this is another way of turning this into a
programming languages problem, right? Not only do you do programming-language-style
transformations inside of Bonsai itself, but you've also
moved up into the OCaml syntax itself and added essentially new language features to OCaml
to make it easier and more convenient to express Bonsai computations in a way that feels normal
to people. Yeah. And I think that it's one of the things that I really enjoy about a number of our
other libraries that operate monadically, in that you don't really need to know what a monad is in order to use them effectively.
But if they have radically different syntax than what you're used to,
it can still be an enormous pain.
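For instance, with Jane Street's ppx_let, monadic code reads like ordinary OCaml let-bindings. A sketch using the Option monad, where the `lookup_*` functions are hypothetical stand-ins:

```ocaml
open Core

(* Hypothetical lookups, purely for illustration. *)
let lookup_user name = if String.(name = "ty") then Some "ty" else None
let lookup_email user = Some (user ^ "@example.com")

(* Without the syntax extension: explicit nested bind and map. *)
let greeting_plain =
  Option.bind (lookup_user "ty") ~f:(fun user ->
    Option.map (lookup_email user) ~f:(fun email -> "hello, " ^ email))

(* With ppx_let, the same logic reads like ordinary OCaml lets: *)
let greeting_ppx =
  let open Option.Let_syntax in
  let%bind user = lookup_user "ty" in
  let%map email = lookup_email user in
  "hello, " ^ email
```

Both definitions compute the same value; the second just hides the plumbing behind syntax that looks like regular OCaml.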
And so by extending the syntax for monads,
I think that a lot of people that start at Jane Street don't know what it is,
don't really need to care.
They'll learn eventually
as they see numerous examples of monads in the code. But it looks so much like regular OCaml
that you can think of it as though it were regular OCaml. Yeah, this kind of just reminds me of the
general fact that the ability to transform the syntax of your language is such a great feature.
And it's always sad when you run into
languages that don't have it. Just having some kind of syntactic abstractions where you can
just write some piece of code that transforms it to simplify things. There are downsides to
various kinds of macro capabilities in that the macros themselves can be hard to reason about and
can sometimes give you awkward error messages and all of that. But when done right, it's just enormously powerful. Large parts of the world,
and large parts of the languages out there, just never adopted it. I think they're moving in that direction
though. It's really hard to think of a programming language that has been designed in the past 10
years that hasn't at least acknowledged the power of metaprogramming
and good metaprogramming in some way. I'm thinking here specifically about like
Zig and Rust as examples of languages that are very static and very serious,
and they still have excellent metaprogramming capabilities.
I guess Java just recently got closures, so maybe it'll get metaprogramming next?
Yeah, one would hope.
I think C Sharp got source code generators, which don't let you extend existing files,
but let you produce new ones based off of the contents of a syntax tree.
Yeah, and it's maybe worth saying, I think OCaml's metaprogramming facilities are okay.
They're not great. It's great that they're there and they're extremely useful. But I think people
who work in Lisp or Scheme or Racket or something in that would mostly point and laugh at OCaml's
macro system. Yeah, I think that's right. It's simultaneously too powerful in that you can do basically anything and inject arbitrary code.
But it's also because of that, it's really hard to just make a tiny one-off extension to OCaml
that you're using perhaps only within the lexical scope of the definition of this macro.
It would be great if we had some kind of macro system in addition to compiler extensions.
Yeah, I agree.
And there's some exciting work in the direction of strengthening OCaml's support for various
kinds of macros.
And also, I think a thing that would be great would be to have macros exposed in a modular
way.
Like right now, what macros you're using is a thing that's kind of controlled entirely
from the build system and is not integrated in a real way in the language.
And I think that's another thing that would make the system more usable and easier to apply in more cases.
So a big part of landing a new library like Bonsai is evangelism, right? You can write the
nicest library you want, and if no one uses it, nothing good happens. How have you approached
the problem of evangelizing Bonsai and getting people excited and getting people using it inside of Jane Street?
I think one of the things that I'm still very happy that I did was backwards compatibility with Incr_dom, in that you can kind of spread Bonsai throughout an Incr_dom program, providing value to people that have existing applications. And once they see the difference between the code that they're maintaining and
the code that they're adding, there were a lot of people that were very motivated to convert
wholesale and start new applications using just Bonsai. And how has that gone in practice? How
popular now is Bonsai compared to Incr_dom for people who are building new things?
I'm not aware of any new serious projects that aren't using it.
But I'll also say on the point of evangelism, I really enjoy it. Well, the main goal for Bonsai was to make people want to modularize their applications and break off bits and pieces
that other people might find useful. So whenever someone does this, I will be their evangelist.
And I maintain an enormous collection
of examples using components that I've written, but also a lot of the other people have written.
And I will make sure that any new application knows exactly what powerful features other people
at the company have contributed. What are you dissatisfied about with respect to Bonsai?
What are the remaining things about it that are problematic that you feel need to be much
better than they are now?
One of the things that I'm a bit disappointed by that isn't directly related to Bonsai,
but we still get flack for it, is composing components can lead to issues when you're
styling things with CSS.
Because if a component author wants to ship a CSS file
alongside their component, there could be issues with name clashes, where two people pick the same
name for a thing, and now they're overriding each other's styles. This is largely solved by doing
component styling in our virtual DOM library, so styles are added directly to the virtual DOM nodes.
But I think that CSS is a much maligned pattern matching
language.
And the things that you can do in it are incredibly useful,
especially for expressing things that
are possible in OCaml, like highlighting
every other row in a table a
different color. But also there are just things in CSS that you can do that you couldn't do
with inline styles, like changing the appearance of an item based on whether or not it's being
hovered over, for example. So this is a pretty weak point when you're writing something using Bonsai Web. And I have a project about to get
merged that I hope will fix this, which also uses OCaml's extension language. And the way that it
works is by writing CSS directly in your OCaml file inside of an invocation to this compiler extension, it'll pull out all of the CSS,
look through the names, give them all unique names, and then expose those names to your
OCaml program so that you can use these uniquified, hashed name values that are coming out of a very
crisp and clean CSS string. So the goal here is to specifically get rid of this case where you essentially pun your
way into the problem, where you had two different classes for nodes and you happened to name
the same thing, and you're just going through and making sure that you have no unintended
name clashes.
Yes.
Well, in fact, we do that by just hashing the contents of your string and also the file name that you're currently in.
And appending that hash will generate something that hopefully doesn't wind up clashing with
anything else. I'll also say that the CSS language itself is not what people complain about when they
complain about CSS. What they're complaining about is the implementation of layout in the browser, which can be very confusing, and the complex interplay between various styling rules.
So these things aren't going away with a PPX.
They might go away with some other abstractions.
We hide the details of the algorithms that are used to display and lay out content.
But CSS itself, no, it's quite nice, I think.
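The uniquifying scheme described above, hashing the stylesheet's contents together with the file name and appending the hash to each class name, can be sketched with OCaml's stdlib Digest module. The real PPX does this at compile time; this is only a runtime approximation:

```ocaml
(* Runtime approximation of the compile-time scheme: hash the CSS
   contents together with the file name, and append a short prefix of
   the hex digest to a class name to make it unique. *)
let uniquify_class ~css_contents ~file_name class_name =
  let digest = Digest.to_hex (Digest.string (css_contents ^ file_name)) in
  class_name ^ "_" ^ String.sub digest 0 8

(* uniquify_class ~css_contents:".button { color: red }"
     ~file_name:"my_component.ml" "button"
   yields something like "button_" followed by eight hex characters,
   so two components that both pick the class name "button" no longer
   collide. *)
```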
Got it.
So you think this does solve the primary problem here, which is this kind of name clash issue?
I think I can come up with other things that I don't like about Bonsai.
Right now, you need to repeat yourself a lot. Moving from one language to another, in this case from the Bonsai component language into the actual OCaml code that manipulates values at runtime,
is quite verbose. I would really like to get rid of a lot of that repetition. Right now,
it's not repetition that I think would cause people to inadvertently write bugs, but you do
wind up repeating a bunch of names in multiple places,
which is a bit unfortunate. And it's something that a very recent addition to OCaml by our own
Stephen Dolan is going to hopefully make better, and that is let punning. This verbosity, the
repeating of these names is going to be fixed in a very new version of OCaml that lets you ignore cases
where the variable that you're binding to
has the same name as the value that you're binding.
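Concretely, here is the let-punning feature being described, available with sufficiently recent OCaml and ppx_let. These are fragments rather than a complete module, and `render` is a hypothetical function:

```ocaml
(* Before let punning: the name [user] is repeated on both sides
   of the binding. *)
let%bind user = user in
render user

(* With let punning, the repetition is elided when the bound variable
   and the expression being bound share a name: *)
let%bind user in
render user
```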
So that leads to another question.
I'm curious what you think the ups and downs are
of OCaml as a language for writing web applications.
The normal language to do it is in JavaScript
and there's lots of other, you could write in Elm,
you can write in TypeScript, you can write in Dart.
How do you think OCaml measures up as a language for doing this kind of work?
I have to say that web development at Jane Street is very different from web development
anywhere else in that we have some of the fastest networks that money can buy. And because of that,
a lot of the things that would kill Bonsai on arrival
at any other company, like a somewhat enormous JavaScript bundle size, not being able to do
server-side rendering in order to cut down on initial load times, or not being able to split an
application into many files so that you're only loading the ones that are needed on the initial
display, just don't apply. Things like this, which are commonplace concerns in other languages and frameworks, are simply not
present. And that's because here we primarily use our web browsers as a 2D vector graphics library
where like we don't care too much about web pages. Almost all of them are
a single page that loads a whole bunch of JavaScript, connects back to the server with
a WebSocket connection, and sits there chugging along all day. And the constraints that most web
developers have to face are just simply non-existent. Right. So for us, the web browser is just like
another place to run an OCaml program, an OCaml program that happens to generate a UI. And our libraries make this very easy in a way that
JavaScript, for example, does not. Is there anything about the language itself that you
feel like gets in the way from a kind of expressiveness and convenience point of view?
Are you mostly pretty happy with OCaml as a language for this kind of work?
Yeah. Yeah, absolutely. I think if I wasn't happy with it, we wouldn't be having this conversation.
Well, you could be like happy,
but have some axes to grind about things that you'd like to make better.
I feel like it's rare to be so happy
about a programming language
that you don't have complaints.
Oh, I definitely have complaints,
but they're mostly from a library author's perspective.
I think there's not a ton that I think I desperately want
from my user's perspective. I mean, I'm going to regret saying this. As soon as we're done,
I'm going to come up with like five things that are like actually deal breakers.
So to ask a grander question still, in what ways are you dissatisfied about the web as a platform for writing user interfaces?
I think that the one thing that every UI author has to contend with is that the only options we really have are JavaScript and the DOM for programming language and UI runtime. And I think that this may change as WebAssembly picks up steam and as
they add features that allow OCaml to compile down to it, like access to the garbage collector,
or I guess ability for us to implement our own garbage collector. I think we would be happy
getting the OCaml GC to run on WebAssembly if that was also possible. But I think that the DOM is going to be here for a while.
And it does a whole bunch of really amazing things.
The fact that anyone really can create
an accessible, stylable user interface
with very little thought is unbelievably powerful.
And I think that the prevalence of the web
goes to show for just how valuable this is.
But for people building user interface components
that lie outside of what the styling
and layout engines of browsers consider normal,
it would be nice to be able to tell them,
hands off, we're going to be managing
what we consider to be the DOM tree from here on out, and be able to forward on some of the
invariants that we maintain to the web browser in order for them to make it as performant as
possible. So how different is this from just drawing your own widgets on a canvas inside of
the browser? In fact, I think I heard a week ago that Google is planning on doing this for various things
in the Google Docs suite, right?
Where they have done a lot of work over time to wrestle with the DOM and get it to do all
sorts of things that the DOM was never designed to do.
Display hundreds of thousands of lines of a spreadsheet.
And it sounds like they are giving up and they're like,
now we're just going to render to the canvas. Does that give you the flexibility that you need?
Is that enough? So yes, but at great cost, the great cost being re-implementing text boxes.
Clearly there's more than just text boxes, but I think text boxes are an incredible example of
complexity that you really don't notice until it's wrong. Things like selection not wrapping
around lines correctly, or maintaining cursor position and moving them correctly through
strings that aren't necessarily ASCII, doing text layout, doing text shaping.
These are things that, yeah, Google's going to have to do,
and they have the manpower to do it.
But if you told me that I needed to implement a text box on top of Canvas,
well, man, we wouldn't.
That's maybe best measured in years.
So if what you want is not just the raw ability to draw stuff to the canvas,
you want something that preserves more of the power of the DOM that exists,
but still gives people more control.
What would you do?
What's the magic feature you would ask for from the web?
I guess direct integration into the layout engine, maybe
APIs that allow us to tell the Chrome renderer what things to avoid drawing,
maybe direct access to their underlying 2D graphics renderer,
and I think a more powerful integration with incoming user events, things like keyboard and mouse,
that allow us to take a bit more power.
Is this a kind of thing that people are talking about, or is this just stuff
you think would be awesome, but it's not on anyone's radar?
I think that there's a proposal called Houdini that includes a lot of this.
It's not totally clear whether or not it will be adopted as a standard
and when those features will arrive or how powerful they will be. But I think that certainly
there are people, probably the same people that are working on Google Docs, that have been bitten
by these problems and are eager to solve them. So you spent all this time on Bonsai, which is a UI toolkit.
Are there other tools for building UIs
that you are especially jealous of or have
a lot of admiration for?
Like, what are other great systems
that you've encountered for building user interfaces?
Yeah, so I think that the Svelte project
is a recent UI toolkit, but also programming language that has a lot of features that I think
make it very desirable for UI development outside of Jane Street. It fundamentally operates on the
basis that UI components are specified ahead of time. They are composed dynamically, but the UI components are passed through a compiler
first, and it can do optimizations. It can find places where bugs might be introduced by composing
things incorrectly. And it just feels to me like, I also haven't done any programming in Svelte,
but reading over the tutorials, it feels like the type of thing
that might have been developed for the web with hindsight.
Are there any approaches to building UIs
that you've seen outside of the web context
that you're excited about?
Like there's a whole world, like, I don't know,
there's a wizards for like composing UIs together
and all sorts of systems for building UIs
in different contexts.
Is there anything in that world
that you're excited about
that you think does something impressive and good?
No.
I think that GUI-based UI design will almost always fall over.
It will not expose enough power to the people that want it.
And at that point, you need either very powerful hooks
so that developers can attach things into a UI that
was built in a GUI, or you need to transition that GUI into code and then hand it off to
the dev.
And I haven't seen any of these work particularly well.
Arguably, one of the most successful systems for building user interfaces is Excel.
I wonder how you think about that fitting in.
Excel is arguably the greatest programming language ever developed.
Certainly, its ability to convince non-programmers that they aren't doing programming is unbelievable.
It builds in an entire incremental framework for reducing recomputation of values. And it makes people integrate the data input that they have
to do with the viewing and analysis that they want to do in a way that I just really haven't
seen replicated anywhere else. Right. I mean, Excel is another case where it does the thing
where it's very limiting. You get the best behavior out of Excel when you do the maximum amount possible
in the equation language.
And the equation language is very limited.
Although excitingly,
they recently got let expressions and closures.
But still, it's a very limited language.
But once you accept those limitations,
boy, oh boy, is it a nice system to work in.
I don't want to build everything inside of Excel.
I mean, I feel like there's like a whole separate long podcast one could have about all the things that go wrong when you use Excel. Excel is far from perfect. But as a way of constructing easy-to-use and easy-to-understand interfaces for people doing data analysis, it's pretty impressive.
Absolutely. And I think that one of the reasons for that is that there's this very tiny barrier
between seeing something on your screen, seeing a visualization or some results for a formula,
and editing the formula that produced that result. You don't need to dig through source code and find
out where that value is being produced. You click on the thing and it'll show you exactly how and
why and where those values and those computations are coming from. And it's unbelievably powerful. And I've been experiencing a different kind of visual-based programming in the Blender
program, which is a 3D modeling, animation, and rendering tool.
But I'm using it almost entirely for shading, in which a typical shader is source code
in a language that looks a lot like C,
and you write a program that operates on usually a single pixel, and it produces a color at the end.
Blender has a wonderful setup in the form of what they call nodes, which is this graph construction
UI where you create nodes that represent different combinators over numbers or
colors or textures. And you can visually see the output of an individual node in the editor. And
as you combine and compose all of these different nodes, you build up something that is practically,
well, for me at least, literally impossible to do in code
without seeing all these intermediate steps represented so easily.
When you say seeing the intermediate steps, do you mean seeing essentially the instructions
in the intermediate steps?
Or do you mean actually being able to interrogate and see the results in the middle of your
image transformation pipeline?
The latter. There's a single key binding that
if you have a node selected, it'll change the output for the entire shader to just be that node.
And of course, well, all of the things that it depends on. But there's also an incredible plugin
that'll actually inline previews above every node that you care about. And so you can see just what does this
node contribute to the rest of the program. So part of the value here is just like a really
good debugger. Yeah. Yeah. Oh, absolutely. Right. And it makes sense. It's for a kind of programming
where a lot of the ordinary things you would do for understanding whether you're doing the right
thing just kind of don't make sense. When you're doing this numerical transformation of data, when you make mistakes,
your program doesn't do something like enormously wrong. It just looks wrong or something.
And it's hard to know what's going on if you can't visually inspect the intermediate components.
Right. And a lot of the tools that we bring to bear on programs written out in source code
just can't apply here. How do you write a test
that passes if your shader looks good? You can't. You need a human to look at it and tweak the
parameters and make it appease some art director somewhere in a movie studio. The debuggability
and inspectability is the feature. It would not be what it is if it were merely a projection into source code
or out of source code. And like with Excel, it has managed to convince a whole bunch of artists
to do programming without knowing it. And I think that's really empowering as well.
Well, I look forward to you tricking a bunch of Jane Street people into thinking that they're
not programming while
they develop UIs inside of our walls and also provide them with excellent debugging tools.
That's right. Yeah. Get back to me on that one.
Will do. Okay. Well, Ty, thanks for joining me. This has been a lot of fun. The world of UIs
is enormously complicated, both the stuff we're doing and just the whole world. And it was fun
to get your view on it all.
Thank you for having me.
This has been a blast.
You can find a full transcript of the episode
along with more information
about some of the topics we discussed,
including Ty's Bonsai library
at signalsandthreads.com.
Thanks for joining us
and see you next time.