CoRecursive: Coding Stories - Tech Talk: Compiling to Bytecode with Thorsten Ball
Episode Date: September 1, 2019. Tech Talks are in-depth technical discussions. What do compilers do? What is the runtime of a language? What does it mean to compile something down to bytecode, and what executes the bytecode? Thorsten Ball answers these questions in this interview with Adam.
"A virtual machine is a computer built in software, a CPU built in software."
"Compilers can be slow. You know, I grew up running Linux and I had Gentoo running, so I basically let my computer run for the whole night to compile my window manager. So I do know how slow compilers can be and the reason they're slow is because you're paying the upfront costs that an interpreter pays at runtime. You're paying a little bit more because you're doing additional optimizations. You're shifting the cost to a point in time where you're happy to pay it."
Links: Writing a Compiler in Go, GCC Codebase Mirror, LLVM Codebase, TCC Compiler, C in 4 functions, 8CC - small self-hosting compiler
https://corecursive.com/037-thorsten-ball-compilers/
Transcript
A compiler is basically an optimization of an interpreter. That is theoretically speaking.
And again, I hope that nobody listening falls off the chair.
There is no difference in their capabilities.
The example I always use is imagine that the two of us, we speak English and we have a
friend that only speaks, I speak German.
So my friend speaks only German, but he doesn't speak English.
And you're saying something to me and I want to translate it for him.
So what I could do is after every sentence that you tell me, I could take a sentence,
translate it in my head and then say to my friend, oh, he just said this or he just said
that.
That would be interpretation.
What I could instead do would be to just listen to what you have to say paragraph wise, or
maybe you have prepared a whole speech.
I could listen to the whole speech and write down the German translation and then hand
it to my friend and just say, read this.
This is now in German.
You can understand this.
And then there wouldn't be this interpretation step because I just translated, I just compiled
what you said into this other language.
And now
there's no overhead of me interpreting all the time. That's a really simple example, but that's
the gist of it. And to be honest, as soon as you start getting into the real world concerning
compilers and interpreters, the lines get blurry. Hello, this is Adam Gordon Bell. Join me as I learn about building software.
This is CoRecursive.
Today, I talked to Thorsten Ball. He wrote this book called Writing a Compiler in Go.
We talk about virtual machines and bytecode and compilers, basically things that were always just kind of magic to me.
This interview is fun
and it's casual and we hit on a lot of interesting low-level topics like how do CPUs work and why
do Java and C# target a virtual machine-based runtime. We also talk about how do you
write readable code, like so readable that it can stand alone in a book format? Thorsten has the answers for all of these questions.
And because he comes from a perspective of a web developer,
he's really good at explaining these things
in a way that I can understand.
Thorsten, I'm pretty excited to have you here
because I really liked your first book.
And now you have,
you've had a second book for a while actually, haven't you?
But I just started reading the second book.
Yeah, thanks.
It's been out since August, like a year now, basically.
August last year.
And like, this is our second interview.
I previously interviewed you on SE Radio.
So I think I'm on a theme here.
So I interviewed you on Software Engineering Radio about interpreters.
And I talked to Bob Nystrom about interpreters.
And like, so this is the third installment.
So I want to talk about compilers.
So your book is called Writing a Compiler in Go.
So my first question, I guess, is like,
why build a compiler?
Because it's fun.
Because compilers are cool, fascinating,
mysterious, powerful.
I don't know.
That's what drew me to compilers.
But I think in the practical
sense, there's two answers. The first one is, if you have to build a programming language,
and you're working on a team that builds a programming language, then of course,
you should know or learn how to build a compiler. And for me, the actual reason was to learn, you know, to basically get rid of a huge blind spot that I had as a college dropout, self-taught developer.
I never took a compiler course, so I never learned this.
And this was always this black spot for me, like this mysterious thing.
And everybody speaks with a lot of respect, you know,
when talking about compilers and I wanted to learn how they work. And then I read a bunch of blog
posts actually that recommended that it's a really valuable thing to do, because if you learn how
compilers work, you actually learn how a lot of things work in other tools. That means you
learn how parsing works. You learn what a lexer is. You learn what an AST is, an abstract syntax
tree. You learn what an intermediate representation is, IR, like, let's say, another form of AST.
And you also learn how computers work, kind of, because you have to think about what a compiler
does, which is translating from one language into the other. That gives you a lot of context.
And that's the main thing, like learning for the sake of learning. That's why I did it. But
there's also a really practical aspect to it. And that is, you know, I kind of touched upon
this just now. But if you learn how to build a compiler or learn
about compilers, you'll see a lot of different things in a different light in your day-to-day
work.
Like you recognize these things that you built previously or that you need to build now,
you now recognize that, oh, that is actually a parser.
Oh, that is actually a form of intermediate representation.
And you also
gain a lot of understanding of how languages work, which helps tremendously if you're debugging,
profiling, if you're doing performance optimizations. And this was two weeks ago,
actually, when me and two colleagues were profiling and doing performance optimizations on a
database that we use.
And while looking at the profiling data, and this was a program written in Go, one of my colleagues said, you know what, you need to know about how the runtime works, how the
language works, if you want to make any sense of what you're looking at.
And I feel that all the time.
It might be this, I don't know what the word for it is,
but as soon as you learn about a concept,
you can see it everywhere
and everything is in a different light.
And you think, oh, how could I have known this
without knowing this other thing
that I just learned about?
But I do feel that there is a great value
for learning, you know,
in learning about compilers and program language,
even if you never built
an actual general purpose programming
language.
That's very cool.
What was the example where you needed to understand how the runtime works?
It's an open source project called Zoekt, built by Google.
And it's a database that indexes source code.
And I work at Sourcegraph and we index source code a lot because we're building a search
engine and code intelligence platform for source code.
So we use Zoekt and we also extend it for our purposes.
And we had to analyze and profile it.
And then it's like, you're looking at the profiling data,
you're looking at flame graphs, you're looking at graphs.
That means you have to understand what is a stack?
What is a call stack?
If you don't know what a call stack is,
it's really tough to make sense of flame graphs. And if you don't know what a garbage
collector is, it makes it a lot harder to understand what the runtime, what is it called, mallocgc call is that Go has. This is basically what shows up as GC in your profiling thing.
And then if you go deeper and try to understand what is actually happening, then you kind of start reasoning about, oh, wait, yeah, we're allocating so much memory here. This is why this profile is actually dominated by the garbage collector. It's not about speed, like we're not slow and runtime is not slow. It's just the runtime has so much stuff to do because we produce so much memory garbage that it has to collect. And it's this reasoning about and understanding
of what is actually happening that allows you to then take a step back and draw the right,
hopefully right conclusion about what to optimize and what not to optimize.
Yeah, because like certain things seem like magic to me. And I guess that since I started looking
through your book, that's one thing that always seemed kind of magical. What does a compiler do?
Or what does a runtime do?
Yeah.
And I guess it's not as magical as maybe you think, right?
Like it's actually fairly simple, standard engineering stuff, right?
I was just going to say it is magical.
To be honest, like if I look at, let's say the Go runtime, to me, it's still magical
in how fast it is.
Like there's your program and you optimize it
for speed and you want it to be fast but then there's also the runtime that is still faster
right it's the thing that still has to keep up with the thing that you're doing it just
shows up barely in the profiles because it just keeps collecting you know your memory and then
allocating stuff and you know for example a runtime also has to do scheduling.
So if you have green threads, for example, as Go does, which are not processor threads
or operating system threads, but, you know, a virtual construct mapped onto them, then
the scheduler has to actually put them and schedule them onto actual threads.
And if you think about it, that's kind of amazing.
You know, like this is all happening.
You're trying to make your program go as fast as possible. And the overhead of the scheduler is
basically nil. It is there, but you can ignore it, right? Because it's just so fast, like you
trust it to do the right thing. And for me, that is actually still fascinating, magical.
That's a good point. I feel like though, maybe I'm naive because I haven't looked at the Go
runtime, right? Or very many runtimes at all. But I've looked at the kind of example you built in
your book. And I feel like that there's just a lot of engineering hours between these two, right?
Like you have the basic structure and then everything else is just time, right?
I thought about this a lot actually in the past few weeks that building a language or a runtime
is a rather straightforward thing.
And most of the complexity comes from making it faster,
more efficient, more performant.
You can build a Java interpreter or something
and I don't know, let's ignore all the special cases
and all of that stuff, but it's not a lot of code. If you want to make it scale, then it's a ton of engineering work
and a ton of code. And if you look at, for example, Google's V8, the JavaScript engine,
that is insane. JavaScript itself is a fairly simple language. I would say nowadays there's a lot of stuff added onto it, but it's still a fairly simple language compared to, let's say, C++ or Rust or what have you.
But if you look at V8, this is crazy. There's so much work put into it and there's so much
engineering skill visible. If you just dip your toes in it, it's mind-blowing. They have multiple interpreters, they have multiple compilers in it, they have so many optimizations. And it's all this simple thing of, let's make JavaScript faster. You know, people run JavaScript on web pages, let's analyze how they write the JavaScript, and let's make that as fast as we can. And to me, it sounds like a lot of fun,
but it's also, it takes a lot of skill.
Yeah, coming at this from a naive perspective,
it sounds great that the use case is so simple.
I feel like I deal with a lot of things
where it's like, there's complicated requirements
and then it's like, just make it faster.
Now, I mean, squeezing that speed out is a challenge, right?
But it's like pure engineering, right?
Rather than like...
So how big does your compiler implementation end up?
I think at the end of the second book,
we're at around 6K lines of code in total,
including the tests.
So it's tiny, tiny, you know, compared to others. I actually looked it up before this, and the compiler comes in at around 1,700 lines, the VM is 1,100, and then the whole front end, which is the thing that takes in basically strings and turns them into a data structure, the parser, the lexer, including the AST, that's nearly 2,000 lines, including tests.
That's crazy.
Yeah.
It's like crazy small, I guess.
Yeah.
But, you know, if you write a book, you learn to shed weight, basically.
Like every line counts.
And I personally think writing these two books, you know, it sounds grand as in it made me a
better programmer, but it's really this,
how can I get rid of this line? How can I make this even simpler? Do I need this function? Do
I need this method? The added requirement is it has to be readable. It has to be easy to
understand. So you cannot do the obfuscation thing where you go, no, no, no, let's just put it in one
line and use a ternary operator or whatever, and make it as small and tiny as possible. Let's just cut the variable names. No, it has to be really simple, easy to understand, and as small as possible.
That's also why I, for example, skipped a bunch of features that a lot of people have asked for,
for example, loops, for loops, while loops. I feel that I could have added this, but it doesn't
add a new thing because once you have functions and once you have conditionals, that means you know that you can conditionally execute code and jump around.
Then you basically have all the tools available to build loops.
It's interesting in some ways.
Yeah, you're describing like the truest readability test, right?
Which is like instead of focusing on lines of code, it's like, no, could somebody pick this up in a book and read it? Yeah. That is so challenging. And I don't want
to sound smart, but a lot of articles or blog posts, what I often see is when people use code
examples, they have variable names that refer to something else, right? If you basically rip out a
code snippet out of your existing code base from work, it refers to, I don't know, administrator or user, which is easy to
understand. Everybody knows what this is. But oftentimes you read through code samples and you
go, why is it called like this? Am I missing something? Is something happening before? Did
I miss another code snippet, and stuff like this? And all of this you need to think about. Like, does every example build upon the previous one? Does every snippet of code stand alone? Can a person look at this and understand it? And that was really hard. Like really, really hard.
Have you ever played around with literate code or whatever? Literate code is this idea that... did Donald Knuth invent this?
I think so, right? Isn't that why he invented Lex, I guess, or TeX?
It could be, it could be. Yeah, the idea is that you basically write prose, as in comments, and then just inject the code in between. And this is kind of what I do in the book, right? The book starts with zero lines of code, and then you explain, let's write down the first function in this file, and then you do this. Literate programming would be if you could then take the prose and the code and hit compile,
and it would take the code and put it in the right places and then execute the program.
So I could have used the tools that exist for literate programming as in org mode, for
example, in Emacs that allows you to do this and then use that to basically produce my book.
But I guess that would also be,
that could be a rabbit hole that you never get out of.
So I think like we're way off on a tangent,
but like the thing that requires,
I think the literate code is that your language,
it doesn't have any constraints on ordering basically
because to do like the true literate code,
you want to be able to introduce things
in the order that makes sense to the reader, right?
That is actually really hard,
the order in which you introduce things.
That is, I've revised several chapters of the book
when it's like, oh, this seems like a simple feature
that I can build in chapter two.
And then it's like, oh, in order to build that feature,
we have to explain this first.
And in order to explain this, we have to do this.
So you kind of try to find your way
to the top of this or the bottom of this tree
and try to find out what is really the starting point
so that you can explain the whole thing
from first principles, right?
Like you need to have everything built
on top of the other thing.
And then at the end, the compiler comes out.
And that was really hard.
And Bob Nystrom, who you mentioned, he also said on Twitter that that is one of the hardest
things about his book too.
And I think Steve Klabnik, who wrote the Rust book, he also said that this was a challenging
part, especially in Rust where Rust is really complex.
And you mentioned building on things.
Like, so this book builds on your previous book
and you mentioned a little bit the front end of a compiler.
Like what's the front end of a compiler?
The front end is, in my wording,
the part of the compiler that takes in the source code
and parses it and turns it into a form that is easy to work with
in the rest of the compiler.
And that form is usually a tree
because if you look at source code
and if you look really closely,
it's actually structured like a tree.
Like, you know, if you look at indentation, for example,
if you know about tree structures,
you can see how you can represent this as a tree,
as in, hey, an if conditional has a condition, then it has a child node that is the consequence
or the alternative, for example.
And then you can see how you can structure this as a tree.
And the front end then takes in the source code, a string or a file full of strings, and then runs it through a lexer, which turns the source code into so-called tokens, or lexemes. And that is, just think about it as the things that you want to work with, stripped of all the details. So instead of working with characters, you work with tokens. For example, instead of saying, oh, this is L-E-T for let, for example, in a let statement, you have a let token, or you have an integer token that is one, two, three, four, instead of having the single characters. It gets rid of all the whitespace and stuff like this.
Like, if you start with a string, let x equals seven, and then your tokenizer turns it into, like, a let token, an x, it's probably a token of type variable or something, yeah, with value equals x, and then an equal sign, yeah, and then a number with the value seven, and then a semicolon or whatever you have. And yeah, that's basically it. So out of let x equals seven, you get four tokens, for example.
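To make that concrete, here is a minimal sketch in Go of the kind of token stream a lexer might produce for let x = 7; the type and field names are illustrative, not necessarily the book's exact definitions.

package main

import "fmt"

// TokenType and Token are illustrative names, not necessarily the book's exact ones.
type TokenType string

type Token struct {
	Type    TokenType
	Literal string
}

const (
	LET       TokenType = "LET"
	IDENT     TokenType = "IDENT"
	ASSIGN    TokenType = "="
	INT       TokenType = "INT"
	SEMICOLON TokenType = ";"
)

func main() {
	// Roughly what a lexer might emit for the input `let x = 7;`.
	tokens := []Token{
		{Type: LET, Literal: "let"},
		{Type: IDENT, Literal: "x"},
		{Type: ASSIGN, Literal: "="},
		{Type: INT, Literal: "7"},
		{Type: SEMICOLON, Literal: ";"},
	}
	for _, tok := range tokens {
		fmt.Printf("%-9s %q\n", tok.Type, tok.Literal)
	}
}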
And then you take these, that's what the lexer does.
And then you put them into the parser
and the parser turns this into an AST or a syntax tree.
We don't have to go into the details here,
but a tree structure that represents this.
And for example, for the let x equals seven, this could be a tree
node called let statement that has one field called identifier, which would point to the token x,
and then one field called value, which it binds to the identifier. And that would point to the
expression that is seven. And then after that, you can do stuff and turn
this AST into a cleaner form or whatever you want to do. And then it goes through the rest of the
compiler. So like with the previous book and with how far I got into Bob's book, like you basically,
once you have that structure, then you're like, okay, let's run it. So you take that let statement
and you have like a case, you're like, oh, case it's a let statement,
then grab out the name and grab out the value
and throw that in a dictionary.
And then like onto the next, right?
Yeah, that is a really succinct description
of what interpretation means, right?
And you cannot find that in any book.
Like, this microphone that you're seeing,
this is on top of multiple compiler books
and you won't find any paragraph in it that says, and then you just take a dictionary and put the
value. This is a really good description. Because, yeah, this is what you do if you run it. Like
if you interpret it, you take the AST, you put it in a function called eval, most of the times for
evaluate. And then you say switch, okay, so whatever your language has. My current node,
what is it? Is it a let statement? Okay, if it's a let statement, let's open a dictionary and say
x equals seven. And then let's interpret the next statement. If the next statement is x plus two,
then we say, oh, x, I know x, let's look at the dictionary and take seven from the dictionary
and add two to it. And then the wrapping statement might be print or
puts. That means, oh, now we have to output nine or something. And that's interpretation.
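As a rough sketch of that eval-with-a-dictionary idea, here is a tiny tree-walking evaluator in Go for exactly the let x = 7 and x + 2 example. The node types and the env map are invented for illustration and are far simpler than what the interpreter book actually builds.

package main

import "fmt"

// Illustrative AST node types; real ones carry more information.
type Node interface{}

type LetStatement struct {
	Name  string
	Value Node
}

type Identifier struct{ Name string }
type IntegerLiteral struct{ Value int }
type InfixExpression struct {
	Left, Right Node
	Operator    string
}

// eval walks the tree and uses env as the "dictionary" of bindings.
func eval(node Node, env map[string]int) int {
	switch n := node.(type) {
	case *LetStatement:
		env[n.Name] = eval(n.Value, env)
		return env[n.Name]
	case *Identifier:
		return env[n.Name]
	case *IntegerLiteral:
		return n.Value
	case *InfixExpression:
		left, right := eval(n.Left, env), eval(n.Right, env)
		if n.Operator == "+" {
			return left + right
		}
	}
	return 0
}

func main() {
	env := map[string]int{}
	// let x = 7;
	eval(&LetStatement{Name: "x", Value: &IntegerLiteral{Value: 7}}, env)
	// x + 2
	result := eval(&InfixExpression{
		Left:     &Identifier{Name: "x"},
		Operator: "+",
		Right:    &IntegerLiteral{Value: 2},
	}, env)
	fmt.Println(result) // 9
}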
Yeah, it's very succinct. I think, yeah, we've just gotten rid of a whole bunch of books. We
can just replace them.
Yeah, just the last two minutes. Put it for free online. People can learn a ton.
So why do we have compilers and stuff? Why a second book, even?
Again: performance, efficiency, you know, optimizations. A compiler is basically an optimization of an interpreter. And I guess a lot of people that know about compilers just got goosebumps. No, no, you can't say that! But you could theoretically do all of the things that you do with an interpreter,
with a compiler or vice versa. That is theoretically speaking. And again, I hope that
nobody listening falls off the chair. There is no difference in their capabilities. But if you
take a compiler, for example, you produce an intermediate result. You translate the code and translate it into
another language. And that allows you to maybe make it faster and get rid of stuff. And if you
think about the example that we just mentioned, where it says, let x equals seven and then x plus
two, let's say that's in a loop, for example, in the code. It says run this, I don't know, 500 times, let x equal whatever.
Then every time you run the loop in your interpreter,
you again have to look at every node and say,
oh, is this a let statement?
Is this an, I don't know, arithmetic expression or something?
Is this an identifier?
Is this a for loop?
Whatever.
You have to do this every time.
And then you can start doing optimizations and you can add caching to it and say, oh, this AST node actually, it caches something
that I can then run again because I know what it is now. And then as soon as you start getting
into it, you realize, wait a second, I'm just going through a program and look at it and analyze
it and translate it
into another form that can be executed. So now you're already doing some form of translation.
And compilation is then taking this further and saying, I'm going to look at this program
and output it in another language. And if I output it in another language, I'm just going to get rid of all the
stuff that I don't need. And I also have to analyze it only once. I don't have to look at everything
all the time, especially not when you then run the program. That's the main thing, right? Compilers
can be slow. You know, I grew up running Linux and I had Gentoo running. So I basically let my
computer run for the whole night to compile my window manager.
So I do know how slow compilers can be.
And the reason they're slow is because
you're paying the upfront cost
that an interpreter pays at runtime.
You're paying a little bit more
because you're doing additional optimizations,
but you're shifting the cost.
And you're shifting the cost to a point in time where
you're happy to pay it. Right. But if you want to execute your program, you don't want to pay this
cost. You want to just run it as fast as possible. And you only have to compile it once and you run
it like a whole bunch of times. Right. Like in theory. Yeah. Yeah, exactly. Just the example I
always use is imagine that the two of us, we speak English and we have a friend that only speaks, I speak German.
So my friend speaks only German, but he doesn't speak English.
And you're saying something to me and I want to translate it for him.
So what I could do is after every sentence that you tell me, I could take a sentence, translate it in my head and then say to my friend, oh, he just said this or he just said that.
That would be interpretation.
What I could instead do would be to just listen to what you have to say paragraph wise, or
maybe you have prepared a whole speech.
I could listen to the whole speech and write down the German translation and then hand
it to my friend and just say, read this.
This is now in German. You can understand this. And then there wouldn't be this interpretation step because I
just translated, I just compiled what you said into this other language. And now there is no
overhead of me interpreting all the time. That's a really simple example, but that's the gist of it.
And to be honest, as soon as you start getting into the real world concerning compilers and
interpreters, the lines get blurry.
You know, it's blurry.
I like this metaphor.
The advantage in this metaphor would be like, if you have a whole bunch of German friends,
because you could hand the paper out to each of them, right?
Exactly.
Whereas, yeah, because I'm thinking like, if I had some optimizations I could do to Ruby code, right? So I'm going to start up my Rails app. And first, I want to go
through and find all these optimizations. But let's say the optimizations take minutes to find,
right? Yeah. Like if Ruby is not compiled, then maybe the trade off is not worth it,
because my startup time becomes so slow, right? That's the whole point. As in when are you
prepared to pay the cost for making a program run faster? At some point, Ruby worked like we were describing, right? It just
executed statements. And then at some point, this could be wrong, but to my Googling, at some point,
Ruby emitted C code and then executed it. I think this is actually recently. So what happened is Ruby is now in version, it's in 2.6.
And a few years ago with Ruby 1.8, it worked exactly like we just said.
It basically went through your Ruby source code and said, oh, you have a class here,
class called Foobar. So I'm going to put Foobar, the class, in my dictionary. And then when you say Foobar.new, I don't know, print first name, I'm going to look in the dictionary: what is Foobar? And, you know, it parsed the input, it built an AST, and then interpreted the AST. And then with Ruby 1.9, which, also, this was around, I think, 2012, or '11, '12, '13, something around that, they switched to a virtual machine.
That means we can go into detail,
but for now, let's just say a virtual machine
is a different form of interpreter,
but a much faster one.
And what they did is they compiled the Ruby AST
into something called bytecode,
which is what the virtual machine runs on.
And then that contains a few optimizations
and then have the VM execute that.
And that was, yeah, seven years ago, roundabout. And what they are working on right now is what
you just said, that they emit C code and then run that. And that is actually incredibly interesting
to me when I first saw this, because this is called JIT compilation. And JIT stands for just-in-time
compilation, which means instead of what we said, you compile by writing down the translation and
then you can run it a bunch of times later on, which is also what we do when we compile a program,
for example. Let's say you write a C program, you compile it, you put it in a compiler,
you get out a binary, and then you can run it, I don't know, hours, days, weeks, months, years later.
JIT compilation means you feed your program to the programming language and it says, only
when I need to execute this part of the program, I'm going to compile it.
And then I'm going to execute the output of that compilation process, the result.
And that is just in time, just when I need to execute it.
And the JVM, the Java Virtual Machine,
was one of the first big industrial implementations
of program languages that did this.
And actually, this was seen as a joke,
like you're never going to get this fast.
That's not going to happen.
But as it turns out, now there's a lot of time that the JVM can use to first run your program slowly and then compile it in the background and then execute it.
What Ruby is now doing is they're doing the JIT compilation by basically in the interpreter,
opening a file and writing in the C code that you would run if you were to interpret the code in your Ruby interpreter.
Then they take that file, pass it to the C compiler, produce a dynamic library, like an object file, load that into memory,
and then just call the function they just wrote into a file, just compiled and just loaded into memory.
And that is actually, that gives you...
Yeah, it doesn't sound like it.
It sounds really, do it yourself at home, like a home-built solution, but it actually
provides a speed up.
You pay this cost only once.
And you get all of the benefits that the C compiler that you use offers.
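Here is a rough, hedged sketch in Go of just the write-C-and-compile-it step being described: generate a C source file, shell out to a C compiler, and get a shared object back. It assumes a compiler named cc is on the PATH, and it stops before the dlopen-and-call step that a real JIT like Ruby's performs.

package main

import (
	"fmt"
	"log"
	"os"
	"os/exec"
	"path/filepath"
)

func main() {
	// Hypothetical C code a JIT might generate for one hot function.
	src := `int jit_add(int a, int b) { return a + b; }`

	dir, err := os.MkdirTemp("", "mini-jit")
	if err != nil {
		log.Fatal(err)
	}
	cFile := filepath.Join(dir, "jit_add.c")
	soFile := filepath.Join(dir, "jit_add.so")

	if err := os.WriteFile(cFile, []byte(src), 0o644); err != nil {
		log.Fatal(err)
	}

	// Shell out to the C compiler to build a shared object (assumes `cc` exists).
	out, err := exec.Command("cc", "-shared", "-fPIC", "-o", soFile, cFile).CombinedOutput()
	if err != nil {
		log.Fatalf("cc failed: %v\n%s", err, out)
	}

	// A real JIT would now dlopen() soFile and call jit_add directly.
	fmt.Println("compiled shared object:", soFile)
}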
It seems like the takeaway is that this is all a mess.
Like the compilers versus interpreters.
So we just described several different ways
Ruby is compiled,
but it's generally considered to be interpreted.
JVM considered compiled,
but I think even the JVM does like,
it'll back up sometimes.
Like it will compile and then uncompile or something.
It will go back to the bytecode version.
That might be, yeah.
Yeah, yeah, yeah.
I know what you mean.
I totally agree 100%. This is, the lines are blurry, right? And it's this, like, the more you
know, the less you're able to basically give a straight up answer, which is, you know, it's really
sad as in like, hey, what's the difference between an interpreter and a compiler? And you start out
by saying, hey, an interpreter runs your program as soon as you start the interpreter and feed it
a program, your compiler translates it, and then later on you can run it. But in the
actual real world, things are converging and interpreters have compilers built in. Compilers
actually interpret stuff and then compile the result. One thing that a compiler can do,
for example, is to... Like a constant? Exactly.
Let x equals five times three.
Exactly.
Yeah, exactly.
So I'm pretty sure every C compiler that is used in production does this.
So if you write in your C program, int x equals 99 plus one,
then it's not going to do that calculation when the program is running.
It's just going to put the hundred in there and say x equals 100. And that is also a form of interpretation, right? It's an optimization, but it's also interpreting the input.
That's interesting.
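A minimal sketch in Go of that constant-folding idea, with invented node names: if both sides of a plus are integer literals, the compiler replaces the expression with the precomputed result, so 99 + 1 simply becomes 100.

package main

import "fmt"

// Illustrative node types, not any particular compiler's real ones.
type Expr interface{}

type IntLiteral struct{ Value int }
type Add struct{ Left, Right Expr }

// fold rewrites `literal + literal` into a single literal at compile time.
func fold(e Expr) Expr {
	switch n := e.(type) {
	case *Add:
		left, right := fold(n.Left), fold(n.Right)
		l, lok := left.(*IntLiteral)
		r, rok := right.(*IntLiteral)
		if lok && rok {
			return &IntLiteral{Value: l.Value + r.Value}
		}
		return &Add{Left: left, Right: right}
	default:
		return e
	}
}

func main() {
	// int x = 99 + 1;  -- the right-hand side before and after folding
	expr := &Add{Left: &IntLiteral{Value: 99}, Right: &IntLiteral{Value: 1}}
	folded := fold(expr)
	fmt.Printf("%+v\n", folded) // &{Value:100}
}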
Yeah. Then if you look at what I mentioned earlier, V8, they have, if I remember correctly, it's changing all the time, but they have multiple interpreters. If you start a JavaScript program, they start out with one interpreter that runs your thing immediately, because if you think about it, what you as a user want is to open a website and have the JavaScript run as fast as possible. Yeah. So that is a hard requirement. So they run this, but at the same time, in the background, they start a compiler
that compiles the JavaScript into bytecode, you know, which is the first form of compilation.
And then they have like their virtual machine running that takes the bytecode and executes it.
But then the virtual machine too says, oh, I've now run this 10 times and the user seems to be
staying on this page for the foreseeable future. So I'm now going to optimize it. And then it's going to this other compiler and it's going to compile it into native machine
code.
It's funny because a colleague was telling me about this.
I probably have it wrong, but there's even a layer like below assembly, like in the CPU,
it can be doing these types of optimizations using microcode.
And the level of abstractions is kind of insane.
Yeah, that is, you know, the saying,
turtles all the way down,
as in you're staring into the abyss
and it just only gets darker and deeper.
But yeah, like even the machine code that you emit,
like let's say you're writing assembly language
and you have your assembler and, you know,
I don't know, adding two numbers and outputting the numbers
and then you compile that.
Maybe a stupid example because it's such a simple thing.
But you would think that your computer executes this.
But then it turns out that computers nowadays, or CPUs more specifically, are so complex.
They're systems of their own.
I don't want to say they have their own brain, but they have so much stuff built in that
they actually decide when and how to run your
program. So if you have a CPU with multiple cores, let's say eight cores, then the CPU basically decides which core is running what. And one thing it also does is speculative execution, where, yeah, the CPU basically looks at the code that you want to
run. And let's say your code is X equals get number from user. And then if X greater than
or equals 10 output foobar, else output, no, whatever. And the CPU will run this code and
it will basically detect that there's the if coming, right?
And then, okay, bad example was that it outputs stuff.
But let's say the thing that you want to do is just add other numbers,
like no side effects, then the CPU will say,
oh, you know, there's a 90% chance that this will be true.
So I'm going to execute the true part
before I even evaluated the actual condition.
Before I even looked at the user input,
I'm going to evaluate the true as if the user said it's greater than 10.
And then if the user doesn't, I'm just going to throw away the result.
Because it's so fast.
Yeah.
Like it's just like I have extra time.
So exactly.
Yeah.
What's the cost?
Yeah.
It's funny.
For years, we had a TiVo, and I just switched cable companies, had to get rid of the TiVo. It's like a PVR, right? But TiVo, I don't even know if they still exist. They're very smart about taping things. And I remember 15 years ago, we first got our TiVo, like, plugged it in. You know, it's like a Linux box with a hard drive, and it just starts taping shows. Like, it tapes the Ghostbusters movie because, I don't know, I got extra hard drive space and people like Ghostbusters, right? Yeah. It's kind of like the CPU is like, I got nothing to do, so maybe we'll hit this if statement. I'll
just start executing things.
Yeah, exactly. The problem is the CPU has to be really smart about what it executes because it
might have side effects, right?
Okay. I'm going to bring it way back. So we started, we did the front end, right?
Yeah.
So like what happens?
So I have my AST.
Yeah.
So in your book, what's next, right?
So I guess you're saying, hey, I want to make this more performant.
So instead of executing statement by statement, what do I do?
You compile it to another language, which means translating it to another language.
That means we take in the AST that the parser produced, and then we translate the AST to
another language.
And in the case of the book, that is a language called bytecode for a virtual machine.
And just to give you a short explanation of virtual machine, a virtual machine is basically,
and this is really, again, sorry, a simplified version, a computer built in software, a CPU built in software. That means, you know, we basically started this the wrong way. I was just going to say a CPU is really simple in what it does. You know, I can't really say that after what we just said, but the idea is simple, right? The CPU takes an instruction, a piece of program code, from memory. It fetches it, that's what it's called. Then it decodes it. That means it looks at it and says, is this, I don't know, an add instruction? Do I need to add a number? Or is this a sub instruction? Do I need to subtract numbers? And then once it does that,
this is called a decoding step. It executes it. That means it talks to memory, it talks to the
ALU, whatever, and executes the statement. Now, a virtual machine is a piece
of software that has this in software. This is called the CPU cycle, the fetch, decode, execute
cycle. And a virtual machine is doing the same thing in software. Just imagine a loop that says,
hey, give me an instruction. What is this instruction? Execute this instruction. Do the thing that the instruction
says. That's basically it. It's a piece of software that does this. Fetch, decode, execute.
It's modeled after a CPU. That means it also has all of the advantages that a CPU has, which is
universal computation, Turing completeness, stuff like that. And the machine code that the virtual machine is running on
is often called bytecode,
which are these instructions as in add one, add two, whatever.
And this is what we build in the book.
We build a virtual machine
that is a simplified form of a CPU that runs on bytecode.
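Here is a stripped-down sketch in Go of such a fetch-decode-execute loop for a stack-based virtual machine; the opcodes and their encoding are invented for this example and are much simpler than the book's actual VM.

package main

import "fmt"

// Invented single-byte opcodes for this sketch.
const (
	OpPush byte = iota // next byte is the value to push
	OpAdd              // pop two values, push their sum
	OpHalt
)

func run(code []byte) int {
	var stack []int
	for ip := 0; ip < len(code); ip++ {
		switch code[ip] { // fetch + decode
		case OpPush: // execute
			ip++
			stack = append(stack, int(code[ip]))
		case OpAdd:
			a, b := stack[len(stack)-2], stack[len(stack)-1]
			stack = append(stack[:len(stack)-2], a+b)
		case OpHalt:
			return stack[len(stack)-1]
		}
	}
	return 0
}

func main() {
	// Bytecode for 1 + 2.
	fmt.Println(run([]byte{OpPush, 1, OpPush, 2, OpAdd, OpHalt})) // 3
}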
And the compiler takes the AST that the parser produces and turns
it into bytecode and then feeds it to the virtual machine to execute it. And this sounds like more
work, you know. Is this faster? Like, you're translating it to another thing, and then you have to execute it again, even after you parse it. But even if we do this while running the program or before we start running the program,
the program gets faster.
At the end of the second book,
the benchmark is, you know,
this is the Fibonacci call,
is actually three times faster
without any other optimizations,
without any profiling,
without any memory optimizations, whatever.
Just by changing the architecture
from an interpreter to a virtual machine, it got three times faster. And the reason is that we touched on this just earlier,
that instead of going through the AST and saying, oh, is this a let statement? Is this an identifier?
Is this an expression? Is this a while loop? Is this a return statement? Instead of doing this
all the time and throwing away the result,
recomputing this information all the time, we compute it once and then encode this information into bytecode, a really simple format that allows us to express the whole program, but in a much
simpler language. And then we feed it to the virtual machine who doesn't even have to do the
whole dance of, oh, what
is a let statement?
What is this?
It doesn't know about let statements anymore.
All it knows about is things like load a number on the stack, load another number on the stack,
add these two together, save the result in slot one, slot two.
You know, it's a really, really simplified version of the program we had earlier where
all the fluff is basically gone, and it's
down to just the computation. So it's like you've made up another language that's even simpler that
you've mapped to? Is that one way of thinking of it? Yeah, exactly. You know, that's what a compiler
does, translating. And this simpler language, bytecode is a simpler language. It's a machine
language for a virtual machine. And let's say you have a compiler
that compiles natively. That means you don't need any other thing like a C compiler. It outputs a
binary. You can run the binary. You can send it to your friend that also runs a Linux computer or
Mac and can also run it. What this compiler does is it takes the C language and translates it into
machine language, which is the machine that your CPU
and your operating system speak.
And what we do in the book, in the compiler for the virtual machine is we translate our
AST into the language that the virtual machine speaks, which is the bytecode that we defined.
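As a sketch of that translation step, here is a toy compile function in Go that walks a small, invented AST and emits a flat bytecode sequence for a stack-style VM like the one sketched earlier; the names are made up for illustration, not the book's actual API.

package main

import "fmt"

// Invented AST node and opcode names for illustration only.
type Expr interface{}
type IntLit struct{ Value byte }
type AddExpr struct{ Left, Right Expr }

const (
	OpPush byte = iota // operand: one byte, the value to push
	OpAdd              // pop two, push sum
)

// compile translates the AST into a flat bytecode sequence.
func compile(e Expr, code []byte) []byte {
	switch n := e.(type) {
	case *IntLit:
		return append(code, OpPush, n.Value)
	case *AddExpr:
		code = compile(n.Left, code)
		code = compile(n.Right, code)
		return append(code, OpAdd)
	}
	return code
}

func main() {
	// 1 + 2 as an AST...
	ast := &AddExpr{Left: &IntLit{Value: 1}, Right: &IntLit{Value: 2}}
	// ...becomes the instruction stream [OpPush 1, OpPush 2, OpAdd].
	fmt.Println(compile(ast, nil)) // [0 1 0 2 1]
}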
Just the same as like Java works, right?
The day one version of the JVM basically, yeah.
Yeah. But yeah, this is the basic idea
behind having a virtual machine.
So what happens if you have,
like if I want to add two numbers together,
what does that end up looking like?
In bytecode, you mean?
Yeah.
There's two categories,
broad categories of virtual machines
or computers in general.
The one is stack machines and the other
one is register machines. And the difference is how these machines use memory. Stack machines say
we do all of our computation in this part of memory that we call a stack and you push values
on it and you pop values off it. And if you want to add two
numbers together, you push the first number, then you push the second number. Now you have two
values on the stack and then you say add, and then we put the result on the stack and then you can
pop that off the stack and put it in somewhere else on a hard drive. I don't know. And a register
machine has this additional region in memory that is addressable, that
allows you to say, put this number in register A, put this number in register 2, put this
number in register 3.
And there's the pros and cons of each approach.
That's a huge discussion.
But generally speaking, people are saying stack machines are easier to build.
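To make the contrast concrete, here is roughly how 1 + 2 might be expressed in each style, written as illustrative pseudo-instructions rather than any real machine's instruction set:

// Stack machine: operands live on an implicit stack.
PUSH 1              // stack: [1]
PUSH 2              // stack: [1, 2]
ADD                 // pop 1 and 2, push 3 -> stack: [3]

// Register machine: operands live in named registers.
LOAD r1, 1
LOAD r2, 2
ADD  r3, r1, r2     // r3 = r1 + r2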
If I was writing something that compiled to x86 or something,
like it's easy because, well, maybe it's not easy,
but it can execute on the actual computer architecture.
But in this case, you have to build that virtual machine, right?
Yeah, but if I build a virtual machine in the book
and I write a compiler for this virtual machine
and I compile my input compiler for this virtual machine. And I compile my input programs
for this virtual machine. And the virtual machine itself is written in Go, which can be compiled for
different architectures, AMD, x86, ARM, whatever you have, 32 bit, 64 bit. Go already has that.
And if I built my virtual machine in Go, I can run my virtual machine
on all of these architectures. And I only have to compile my code for my virtual machine.
Now, if you compile to machine code, to native, then things get hairy. Because, for example,
I'm currently writing, actually in my spare time, I'm writing a scheme to x86 compiler.
I'm following resources and x86, you're not going to write the machine
code in binary. What you're going to do is you're going to emit assembly language and then assemble
that into machine code, right? So you decided for an architecture, then you had to make the decision
32 or 64, then you have to decide which assembler to target. And then also, even if you target the
correct assembler, you also need to make sure that the architecture that you want to target is supported by this
assembler. For example, I'm emitting currently x86 32-bit code in assembly language for the
GCC, the GNU Compiler Collection, assembler. The code that I emit is perfectly fine to compile
on Linux, but not on Mac, right? Because it's a different architecture. So then again, you have this, oh no, it's a different thing. Even though both computers have, like, AMD64s, and I have the same assembler, it's still a different architecture, so it doesn't work on both. And it's just a lot of work.
You know, this is also, I think,
if you look at the biggest compilers
or the most famous compilers like GCC, LLVM,
a lot of the code is in these so-called backends,
you know, which we actually didn't,
or I didn't explain yet.
The backend of a compiler is the thing
that actually takes AST
or another form of another representation
of the input source code and emits
it into this other language, like outputs it, prints it, you know, writes it to a file.
So your backend emits your bytecode.
Exactly.
GCC has a backend that basically fans out to multiple backends where it says,
oh, we're running on FreeBSD and this is a MIPS machine.
In your bytecode, how many instructions do you end up with?
I would say, I don't know, 30 or something, I guess. That's too much. 20, maybe 20, 30. So,
you know, you have things like add, subtract, multiply, equals, not equals, jump, jump if equals, stuff like that. But that is tiny compared to real-world stuff. I think the JVM has 200 or something. There's an upper limit, basically, and that is the number of different instructions that you can encode in a byte, right? And because you want this to be as small as possible, you want the virtual machine to be able to say,
to fetch and decode only one byte
and be able to say, this is this instruction.
If you had to fetch four bytes to be able to say,
oh, this is an add instruction,
that would be catastrophic.
Okay, I never thought of this before.
So it's called bytecode because it's basically using a byte
to store all the possible instructions.
Yeah.
If you have a jump instruction that says jump to,
you know, you need to say where, where do you want to jump?
And then you can say instruction 10, 15, 200, whatever.
And what you want, ideally, to get the best performance
is you want this to be in one byte, right?
So that the machine only has to fetch one byte.
And let's say you use only four bits in a byte to encode the type of instruction.
That means you have four bits left in which you can possibly encode the first operand, right?
So you could have specialized versions of these instructions where you say,
oh, if it's a jump instruction and we call this one, this is the instruction with the number 12.
This is the small jump, which is able to jump to relative positions up to 128, which is
what we can encode in four bits.
Then it's just in one byte.
We take the first four bits, look at them and say, oh, it's a jump tiny, whatever you
want to call it.
That means in the other four bits is the target of this jump instruction. Then we don't
have to fetch another thing and have to decode it. Wow. Yeah. That's where it gets messy.
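Here is a small sketch in Go of the more common variant of that idea, where a one-byte opcode is followed by a fixed-width operand (two bytes, big-endian, here); the bit-packing trick described above would be an optimization on top of this, and the opcode number is made up for the example.

package main

import (
	"encoding/binary"
	"fmt"
)

const OpJump byte = 7 // invented opcode number for this sketch

// makeJump encodes OpJump followed by a 16-bit big-endian target address.
func makeJump(target uint16) []byte {
	ins := make([]byte, 3)
	ins[0] = OpJump
	binary.BigEndian.PutUint16(ins[1:], target)
	return ins
}

func main() {
	ins := makeJump(200)
	fmt.Printf("encoded: %v\n", ins) // [7 0 200]

	// Decoding: fetch the opcode, then read the operand bytes.
	op := ins[0]
	target := binary.BigEndian.Uint16(ins[1:])
	fmt.Printf("opcode=%d target=%d\n", op, target)
}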
This is where, if you think about it, this is where the performance gets won,
stuff like this, right? Because this loop, this fetching, decoding, this is basically all
that the VM does. And your goal is to shrink that down so much that the time that your program, or that the VM and your program, spend is only spent on your program, not in the overhead of the machine. Which brings us back to the beginning, where we talked about the Go runtime and how it's so magical that basically just this 1% of execution time is spent in the runtime. And that's your goal. You want the
VM to disappear and only have your program execute. And if we rewind, when we were talking about like
executing off the tree, right? So we would eval by looking at our current node and figuring out
what I should do and execute it. The VM implementation ends up similar, right?
Except with a much smaller instruction set.
I grabbed the instruction.
Oh, if it's an add, then I have to pop twice
and then add them together and then push it back, right?
You're 100% right, which is also why sometimes people,
and this is really not helpful to the whole thing,
people say VMs are
bytecode interpreters. Oh, okay. You know, so that is also another level of confusion added there
because yeah, they're interpreting bytecode. And you know, you can also say CPU is a machine code
interpreter, right? Yeah, that's true. Yeah, that's true. So your book is called Writing a Compiler, but also you have to write
this implementation. You have to build a virtual machine that can execute the instructions of your
compiler.
That didn't fit in at all. Writing a compiler and a virtual machine, that would be a mess.
So one thing that if I think about it logically, there's something weird where like,
so I take a string, I take a string, I turn it into a series of tokens, like a linear list, right?
I take that list, I turn it into a tree.
Then I take that tree, I turn it back into a linear list of instructions, right?
Yeah, yeah.
But what you're doing is called lowering: you're lowering your program from one language to another.
You know, we all have this idea of a high-level language and a low-level language.
And what you basically do is when you translate from your high-level language to this bytecode thing is
you get rid of all the high-level language parts.
It makes sense because if you think about it,
the computer executes this very low-level language.
And then we've come up with all these ways
to write things more human readable.
And so like the compiler is the place where this mapping
between the low level and the high level has to live.
Yeah, it's funny that in,
let's say you were a programmer in the 60s and I wasn't,
but if you were, I'm pretty sure
that you would be knowledgeable
in the low level instructions
that a computer is able to execute.
And you could then, if you look at the other language, like Lisp, for example,
which looks higher level, as in you have recursive function calls,
first class function, stuff like this.
You can see how it's made up out of these building blocks,
because you can then say, oh, you know, C, for example, has for loops.
Oh, this is just, you know, an if and it's a conditional.
They don't even say if.
It's a conditional and a jump instruction.
You know, I can see how that is built up from these smaller, simpler building blocks.
And nowadays, all of us basically that haven't learned assembly language in college or at
university, we're doing the reverse process.
We're discovering like, oh, my high level language, that's just made up out of these
simple building blocks. And it's a really humbling and fascinating experience, to be honest. Like if
you can see, oh, wait, it all comes down to just this, you know, set of instructions. And then you
realize, oh, wait, this is what Turing Complete is about, you know, that you can express all of the programs that you can basically come up with in your work
with this simple Turing machine, which is what a CPU or a computer is. And then mind blown.
Yeah, because people don't know this anymore. Like I certainly didn't, right? I mean,
some people do who've been around or who do like C programming or device driver stuff.
Yeah, if you do hardware programming,
I guess you know this.
And to be honest, I wouldn't even go so far
and say you need to know this.
I think it certainly doesn't hurt to know this
because as I mentioned earlier,
if you do performance optimizations, for example,
you're at some point gonna touch something beneath the thing you're working on.
And then it's good to know how this lower level basically serves your higher level.
This virtual machine that you write, it's a proxy for this Go runtime or the JVM or whatever, right?
It gives you a way to think about what's happening.
Yeah, exactly. Yeah. And it's also, I mean, there's, I guess there's cases where people, they stumble onto things
where they say they've run a Java program.
Then they look at the bytecode that has been compiled and that is being executed.
And then they look at it and say, oh, there's an additional instruction that's not needed
because we're going to throw this result away anyway.
I had an interview with Bryan Cantrill
and he spent a long time like writing operating systems,
probably still does.
So I'm following him on Twitter
and sometimes he posts things where he's,
this C code equals this assembly code for GCC.
Why is this generating this assembly and this, that?
Yeah, yeah.
You know, when Rust turned out faster than C,
then he's looking at the assembly and seeing like,
what are they doing here?
Yeah, exactly, yeah.
Yeah, I think it's crazy cool.
I was intimidated by this for a long time, as in, oh my God, these people know so, so
much about all of this cool and fascinating stuff.
And nowadays that I know a little bit more about this stuff and also know some of these
people, I can with confidence say that these people that do compile optimizations and assembly,
whatever you have, they, for example, couldn't build a website that looks as nice as the ones
that I built. It's a different set of skills. Programming is a vast, vast field, right? And
it's the same as I don't have the slightest clue about graphics programming or game programming. Absolutely no idea, right? My background, my expertise, is web platforms and web applications, optimizing them, scaling them, programming languages. But no clue about graphics programming.
So then does that mean your next book is going to be graphics programming?
No, I'm actually, I'm not super interested in graphics programming. I don't know why. I really like the faceless things,
as in I like compilers, programming languages,
web servers, databases, operating systems.
I don't know why, but it's just fascinating to me.
This basically goes back to me being 15 and being told,
you can run a computer without attaching a monitor to it
and without attaching a keyboard, and it's called a server, and I can connect to it. And ever since then, this has been my fascination.
So, yeah. So I think you mentioned that the whole thing, including tests, you're under 6,000 lines of code for your front end, back end, and everything. Yeah. So what is, like, how does that compare to the real world? Are there bigger compilers, smaller compilers?
A lot of, a lot of bigger compilers. I actually looked this up when writing the book to have some numbers to back up the claim of all compilers are these huge, complex beasts. And, um, last I checked, GCC has 15 million lines of code.
Wow. That is a lot, a lot.
That is crazy.
And LLVM is not too far behind.
I think if I remember correctly, it's slightly smaller,
but it's still in the millions, even more than 10 million, I guess.
As soon as you start adding real world stuff, for example, error reporting, right?
Your parser needs to output things like,
oh, in this file, at this location, this line, this column,
there was an error.
I expected this.
This is, this sounds, well, I don't know.
It depends on the audience who's listening,
but it might sound easy, but it's actually like,
this is the brittle stuff that you need to get right.
And it's, as we can all imagine, I guess,
it's hard to get right.
And then what we touched upon earlier, like guess it's hard to get right. And
then what we touched upon earlier, like the different backends for different architectures,
that's a lot of code optimizations. That's basically this huge thing in the middle of
a compiler that takes up a lot of complexity. And optimizations are basically programs that
take in a program and look at it and say,
how can I make this faster? And they do this by either looking at a tiny part or they need more information and look at the whole thing and then look at the tiny part again. And then they optimize
the tiny part and have to make sure that it's still correct. Like you cannot remove the stuff
that's actually important. You know, stuff like this, correctness, error reporting. I guess you can imagine how complex it gets once you add all of the little details. That's the upper end, I guess, of compilers.
Yeah. What's the small, like, the lower end? Are there languages as small as Monkey?
Yeah, I mean, it depends on your definition of language, but I guess you can, you know, you can probably write a C compiler in 100 lines of C, but then it's just for a really, really tiny subset of C.
It is a compiler as in it takes in C and it outputs assembly language, for example, but
it doesn't compile the whole programming language C, only tiny subset.
It doesn't treat the undefined behavior correctly and stuff like this.
I guess in the real world, one of the most famous small compilers is TCC,
the Tiny C Compiler by Fabrice Bellard.
He's the guy who also built FFmpeg
and he also built JSLinux.
This was, I don't know if you remember this
or if you ever saw it,
it's like an implementation of Linux in JavaScript
that runs in your browser.
And this
was 10 years ago, if I remember correctly. This was crazy. And this guy basically, he programs,
then he drops this huge bomb, and then he goes back and programs. And he basically built FFmpeg,
this incredibly fast video transcoder, collection of utilities, comes back with JSLinux, and he also built something called the Tiny C Compiler. And this is a fairly complete C compiler, as in it compiles basically the complete C language, for, I think, multiple architectures. And it comes in at around 70,000 lines of code, which is still a lot, but compared to 50 million lines of code, it is tiny.
But if you then say, no, we're only going to support one architecture or just a subset
of C, we're going to ignore the preprocessor, for example, and just C itself, we are only
going to compile that, then you can go much lower.
And there's two projects, the one that I have in mind right now. The one is called C4. If I remember correctly, it is C,
a C compiler in four functions of C, which is kind of a joke as a like, you know, tongue in cheek,
you probably shouldn't have done it in four functions. The thing is, it's really like one
file, and you look at it, and it has a read function that reads in source code. It has a parse function that parses it,
then I guess compile and emit. But there's another project that is 10K lines of code,
so slightly bigger, and it's called 8CC. And I don't know where the 8 comes from,
but it's a C compiler that can actually compile itself. So it's self-hosting.
And that is 10,000 lines of code. And it's pretty readable. It's pretty easy to
understand. It's clean C code. There's not a lot of fancy stuff going on. So if you're looking for
something like examples of how to build a C compiler, this is a good thing to look at.
The guy who wrote it, he actually bought my book and wrote me a message on Twitter saying,
hey, I love your book. You're the dude who wrote 8CC. Like, why do you need to read my book? And he said, you know, you're my teacher when it comes to
explaining code. Yeah, I really like your books. Somebody was asking me, oh, why this book and not
like the Dragon Book or something? And I was, it's just like code and explaining how the code works.
And then like, here's the code as well. So I like this approach. It's very developer-centric.
Yeah, thank you. Happy to hear that.
And it's also the main reason why I wrote it, right? Because I needed this, like a language that is not a toy language.
It's not a 50 line implementation of something, but it's also not a huge complex compiler
with lots of optimizations and stuff.
But like the middle thing explained
line by line with tests, which I really enjoy. Yeah, so I had to write it myself.
Nice. That's a great book.
Thanks.
So thank you so much for joining me. It's been a lot of fun.
Yeah, thanks for inviting me. This was a lot of fun. I really enjoyed it.
That was the show. Thank you for listening to the CoRecursive Podcast.
I'm Adam Gordon Bell, your host.
If you're listening right now,
reach out and let me know what you thought
or just that you listened.
In the Slack channel,
we have recently been discussing great books
on software design.
As a guest a couple episodes back mentioned,
James Koppel,
there really are very few great books on this topic,
which is kind of crazy.
Until next
time, thank you so much for listening.