CoRecursive: Coding Stories - Tech Talk: Compiling to Bytecode with Thorsten Ball

Episode Date: September 1, 2019

Tech Talks are in-depth technical discussions. What do compilers do? What is the runtime of a language? What does it mean to compile something down to bytecode, and what executes the bytecode? Thorsten Ball answers these questions in this interview with Adam. "A virtual machine is a computer built in software, a CPU built in software." "Compilers can be slow. You know, I grew up running Linux and I had Gentoo running, so I basically let my computer run for the whole night to compile my window manager. So I do know how slow compilers can be and the reason they're slow is because you're paying the upfront costs that an interpreter pays at runtime. You're paying a little bit more because you're doing additional optimizations. You're shifting the cost to a point in time where you're happy to pay it." Links: Writing A Compiler In Go, GCC Codebase Mirror, LLVM Codebase, TCC Compiler, C in 4 functions, 8cc (a small self-hosting compiler). https://corecursive.com/037-thorsten-ball-compilers/

Transcript
Starting point is 00:00:00 A compiler is basically an optimization of an interpreter that is theoretically speaking. And again, I hope that nobody listening falls off the chair. There is no difference in their capabilities. The example I always use is imagine that the two of us, we speak English and we have a friend that only speaks, I speak German. So my friend speaks only German, but he doesn't speak English. And you're saying something to me and I want to translate it for him. So what I could do is after every sentence that you tell me, I could take a sentence,
Starting point is 00:00:33 translate it in my head and then say to my friend, oh, he just said this or he just said that. Or what I could do, this would be interpretation. What I could instead do would be to just listen to what you have to say paragraph wise, or maybe you have prepared a whole speech. I could listen to the whole speech and write down the German translation and then hand it to my friend and just say, read this. This is now in German.
Starting point is 00:00:56 You can understand this. And then there wouldn't be this interpretation step because I just translated, I just compiled what you said into this other language. And now there's no overhead of me interpreting all the time. That's a really simple example, but that's the gist of it. And to be honest, as soon as you start getting into the real world concerning compilers and interpreters, the lines get blurry. Hello, this is Adam Gordon Bell. Join me as I learn about building software. This is CoRecursive.
Starting point is 00:01:31 Today, I talked to Thorsten Ball. He wrote this book called Writing A Compiler In Go. We talk about virtual machines and bytecode and compilers, basically things that were always just kind of magic to me. This interview is fun and it's casual and we hit on a lot of interesting low-level topics like how do CPUs work and why do Java and C# target a virtual machine-based runtime. We also talk about how do you write readable code, like so readable that it can stand alone in a book format? Thorsten has the answers for all of these questions. And because he comes from a perspective of a web developer, he's really good at explaining these things
Starting point is 00:02:12 in a way that I can understand. Thorsten, I'm pretty excited to have you here because I really liked your first book. And now you have, you've had a second book for a while actually, haven't you? But I just started reading the second book. Yeah, thanks. It's been out since August, like a year now, basically.
Starting point is 00:02:32 August last year. And like, this is our second interview. I previously interviewed you on SE Radio. So I think I'm on a theme here. So I interviewed you on Software Engineering Radio about interpreters. And I talked to Bob Nystrom about interpreters. And like, so this is the third installment. So I want to talk about compilers.
Starting point is 00:02:50 So your book is called Writing A Compiler In Go. So my first question, I guess, is like, why build a compiler? Because it's fun. Because compilers are cool, fascinating, mysterious, powerful. I don't know. That's what drew me to compilers.
Starting point is 00:03:04 But I think in the practical sense, there's two answers. The first one is, if you have to build a programming language, and you're working on a team that builds a programming language, then of course, you should know or learn how to build a compiler. And for me, the actual reason was to learn, you know, to basically get rid of a huge blind spot that I had as a college dropout, self-taught developer. I never had visited a compiler course, so I never learned this. And this was always this black spot for me, like this mysterious thing. And everybody speaks with a lot of respect, you know, when talking about compilers and I wanted to learn how they work. And then I read a bunch of blog
posts actually that recommended that it's a really valuable thing to do, because if you learn how compilers work, you actually learn how a lot of things work in other tools. That means you learn how parsing works. You learn what a lexer is. You learn what an AST is, an abstract syntax tree. You learn what an intermediate representation is, IR, like let's say another form of AST. And you also learn how computers work, kind of, because you have to think about what a compiler does, which is translating from one language into the other. That gives you a lot of context. And that's the main thing, like learning for the sake of learning. That's why I did it. But there's also a really practical aspect to it. And that is, you know, I kind of touched upon
this just now. But if you learn how to build a compiler or learn about compilers, you'll see a lot of different things in a different light in your day-to-day work. Like you recognize these things that you built previously or that you need to build now, you now recognize that, oh, that is actually a parser. Oh, that is actually a form of intermediate representation. And you also gain a lot of understanding of how languages work, which helps tremendously if you're debugging,
profiling, if you're doing performance optimizations. And this was two weeks ago, actually, when me and two colleagues were profiling and doing performance optimizations on a database that we use. And while looking at the profiling data, and this was a program written in Go, one of my colleagues said, you know what, you need to know about how the runtime works, how the language works, if you want to make any sense of what you're looking at. And I feel that all the time. It might be this, I don't know what the word for it is, but as soon as you learn about a concept,
Starting point is 00:05:47 you can see it everywhere and everything is in a different light. And you think, oh, how could I have known this without knowing this other thing that I just learned about? But I do feel that there is a great value for learning, you know, in learning about compilers and program language,
even if you never built an actual general purpose programming language. That's very cool. What was the example where you needed to understand how the runtime works? It's an open source project called Zoekt built by Google. And it's a database that indexes source code. And I work at Sourcegraph and we index source code a lot because we're building a search
engine and code intelligence platform for source code. So we use Zoekt and we also extend it for our purposes. And we had to analyze and profile it. And then it's like, you're looking at the profiling data, you're looking at flame graphs, you're looking at graphs. That means you have to understand what is a stack? What is a call stack? If you don't know what a call stack is,
Starting point is 00:06:44 it's really tough to make sense of flame graphs. And if you don't know what a garbage collector is, it makes a lot harder to understand what the runtime, what is it called, malloc-gc call is that Go has. This is basically what shows up as a GC in your profiling thing. And then if you go deeper and try to understand what is actually happening, then you kind of start reasoning about, oh, wait, yeah, we're allocating so much memory here. This is why this profile is actually dominated by the garbage collector. It's not about speed, like we're not slow and runtime is not slow. It's just the runtime has so much stuff to do because we produce so much memory garbage that it has to collect. And it's this reasoning about and understanding of what is actually happening that allows you to then take a step back and draw the right, hopefully right conclusion about what to optimize and what not to optimize. Yeah, because like certain things seem like magic to me. And I guess that since I started looking through your book, that's one thing that always seemed kind of magical. What does a compiler do?
Starting point is 00:07:45 Or what does a runtime do? Yeah. And I guess it's not as magical as maybe you think, right? Like it's actually fairly simple, standard engineering stuff, right? I was just going to say it is magical. To be honest, like if I look at, let's say the Go runtime, to me, it's still magical in how fast it is. Like there's your program and you optimize it
for speed and you want it to be fast, but then there's also the runtime that is still faster, right? It's the thing that still has to keep up with the thing that you're doing. It just shows up barely in the profiles because it just keeps collecting, you know, your memory and then allocating stuff. And, you know, for example, a runtime also has to do scheduling. So if you have green threads, for example, as Go does, which are not processor threads or operating system threads, but, you know, a virtual construct mapped onto them, then the scheduler has to actually put them and schedule them onto actual threads. And if you think about it, that's kind of amazing.
Starting point is 00:08:43 You know, like this is all happening. You're trying to make your program go as fast as possible. And the overhead of the scheduler is basically nil. It is there, but you can't ignore it, right? Because it's just so fast, like you trust it to do the right thing. And for me, that is actually still fascinating, magical. That's a good point. I feel like though, maybe I'm naive because I haven't looked at the Go runtime, right? Or very many runtimes at all. But I've looked at the kind of example you built in your book. And I feel like that there's just a lot of engineering hours between these two, right? Like you have the basic structure and then everything else is just time, right?
Starting point is 00:09:22 I thought about this a lot actually in the past few weeks that building a language or a runtime is a rather straightforward thing. And most of the complexity comes from making it faster, more efficient, more performant. You can build a Java interpreter or something and I don't know, let's ignore all the special cases and all of that stuff, but it's not a lot of code. If you want to make it scale, then it's a ton of engineering work and a ton of code. And if you look at, for example, Google's V8, the JavaScript engine,
that is insane. JavaScript itself is a fairly simple language. I would say nowadays there's a lot of stuff added onto it, but it's still a fairly simple language compared to, let's say, C++ or Rust or what have you. But if you look at V8, this is crazy. There's so much work put into it and there's so much engineering skill visible if you just dip your toes in it that it's mind-blowing. They have multiple interpreters, they have multiple compilers in it, they have so many optimizations. And it's all this simple thing of, let's make JavaScript faster. You know, people run JavaScript on web pages, let's analyze the JavaScript they write and let's make that as fast as we can. And to me, it sounds like a lot of fun,
Starting point is 00:10:46 but it's also, it takes a lot of skill. Yeah, coming at this from a naive perspective, it sounds great that the use case is so simple. I feel like I deal with a lot of things where it's like, there's complicated requirements and then it's like, just make it faster. Now, I mean, squeezing that speed out is a challenge, right? But it's like pure engineering, right?
Rather than like... So how big does your compiler implementation end up? I think at the end of the second book, we're at around 6K lines of code in total, including the tests. So it's tiny, tiny, you know, compared to others. I actually looked it up before this, and the compiler comes in at around 1,700 lines. The VM is 1,100. And then the whole front end, which is the thing that takes in basically strings and turns it into a data structure, the parser, the lexer, including the AST, that's nearly 2,000 lines, including tests. That's crazy. Yeah. It's like crazy small, I guess. Yeah.
Starting point is 00:11:57 And I personally think writing these two books, you know, it sounds grand as in it made me a better programmer, but it's really this, how can I get rid of this line? How can I make this even simpler? Do I need this function? Do I need this method? The added requirement is it has to be readable. It has to be easy to understand. So you cannot do the obfuscation thing where you go, no, no, no, let's just put it in one line and use a ternary operator or whatever, and make it as small and tiny as possible. Let's just cut the variable names. No, it has to be really simple, easy to understand, and as small as possible. That's also why I, for example, skipped a bunch of features that a lot of people have asked for, for example, loops, for loops, while loops. I feel that I could have added this, but it doesn't
Starting point is 00:12:43 add a new thing because once you have functions and once you have conditionals, that means you know that you can conditionally execute code and jump around. Then you basically have all the tools available to build loops. It's interesting in some ways. Yeah, you're describing like the truest readability test, right? Which is like instead of focusing on lines of code, it's like, no, could somebody pick this up in a book and read it? Yeah. That is so challenging. And I don't want to sound smart, but a lot of articles or blog posts, what I often see is when people use code examples, they have variable names that refer to something else, right? If you basically rip out a code snippet out of your existing code base from work, it refers to, I don't know, administrator or user, which is easy to
Starting point is 00:13:29 understand. Everybody knows what this is. But oftentimes you read through code samples and you go, why is it called like this? Am I missing something? Is something happening before? Did I miss another code snippet? And stuff like this. And all of this you need to think about: does every example build upon the previous one? Does every snippet of code stand alone? Can a person look at this and understand it? And that was really hard, like really, really hard. Have you ever played around with the literate code or whatever? Literate code is this idea that, did Donald Knuth invent this? I think so, right? Isn't that why he invented Lex, I guess, or TeX? It could be, it could be. Yeah, the idea is that you basically write prose as in comments and then
Starting point is 00:14:12 just inject the code in between. And this is kind of what I do in a book, right? The book starts with zero lines of code and then you explain, let's write down the first function in this file, and then you do this. Literate programming would be if you could then take the prose and the code and hit compile, and it would take the code and put it in the right places and then execute the program. So I could have used the tools that exist for literate programming, as in Org mode, for example, in Emacs that allows you to do this, and then use that to basically produce my book. But I guess that would also be, that could be a rabbit hole that you never get out of.
Starting point is 00:14:53 So I think like we're way off on a tangent, but like the thing that requires, I think the literate code is that your language, it doesn't have any constraints on ordering basically because to do like the true literate code, you want to be able to introduce things in the order that makes sense to the reader, right? That is actually really hard,
Starting point is 00:15:12 the order in which you introduce things. That is, I've revised several chapters of the book when it's like, oh, this seems like a simple feature that I can build in chapter two. And then it's like, oh, in order to build that feature, we have to explain this first. And in order to explain this, we have to do this. So you kind of try to find your way
Starting point is 00:15:30 to the top of this or the bottom of this tree and try to find out what is really the starting point so that you can explain the whole thing from first principles, right? Like you need to have everything built on top of the other thing. And then at the end, the compiler comes out. And that was really hard.
Starting point is 00:15:48 And Bob Nystrom, who you mentioned, he also said on Twitter that that is one of the hardest things about his book too. And I think Steve Klabnik, who wrote the Rust book, he also said that this was a challenging part, especially in Rust where Rust is really complex. And you mentioned building on things. Like, so this book builds on your previous book and you mentioned a little bit the front end of a compiler. Like what's the front end of a compiler?
Starting point is 00:16:15 The front end is, in my wording, the part of the compiler that takes in the source code and parses it and turns it into a form that is easy to work with in the rest of the compiler. And that form is usually a tree because if you look at source code and if you look really closely, it's actually structured like a tree.
Starting point is 00:16:38 Like, you know, if you look at indentation, for example, if you know about tree structures, you can see how you can represent this as a tree, as in, hey, an if conditional has a condition, then it has a child node that is the consequence or the alternative, for example. And then you can see how you can structure this as a tree. And the front end then takes in the source code, a string or a file full of strings, and then runs it through a lexer, which turns the
Starting point is 00:17:06 source code into so-called tokens or lexemes. And that is, just think about it as the things that you want to work with, stripped of all the details. So instead of working with characters, you work with tokens. For example, instead of saying, oh, this is L-E-T for let, for example, in a let statement, you have a let token, or you have an integer token that is one, two, three, four, instead of having the single characters. Gets rid of all the white space and stuff like this. Like if you start with a string, let x equal seven, and then your tokenizer turns it into like a let token, an x, it's probably a token of type variable or something, yeah, with the value x, and then an equal sign, yeah, and then a number with the value seven, and then a semicolon or whatever you have. And yeah, that's basically it.
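To make that lexing step concrete, here is a rough Go sketch of what a lexer might emit for let x = 7. The type names and token constants are invented for illustration; they are not the exact ones from the book.

```go
package main

import "fmt"

// TokenType and Token are illustrative: the lexer's job is to group raw
// characters into labeled pieces and throw away the whitespace.
type TokenType string

type Token struct {
	Type    TokenType
	Literal string
}

const (
	LET    TokenType = "LET"
	IDENT  TokenType = "IDENT"
	ASSIGN TokenType = "="
	INT    TokenType = "INT"
)

func main() {
	// What "let x = 7" might look like after lexing: four tokens
	// (plus a SEMICOLON token, if your language ends statements with ";").
	tokens := []Token{
		{Type: LET, Literal: "let"},
		{Type: IDENT, Literal: "x"},
		{Type: ASSIGN, Literal: "="},
		{Type: INT, Literal: "7"},
	}
	for _, tok := range tokens {
		fmt.Printf("%-6s %q\n", tok.Type, tok.Literal)
	}
}
```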
Starting point is 00:18:01 So you get out of let x equals seven, you get four tokens, for example. And then you take these, that's what the lexer does. And then you put them into the parser and the parser turns this into an AST or a syntax tree. We don't have to go into the details here, but a tree structure that represents this. And for example, for the let x equals seven, this could be a tree node called let statement that has one field called identifier, which would point to the token x,
Starting point is 00:18:34 and then one field called value, which it binds to the identifier. And that would point to the expression that is seven. And then after that, you can do stuff and turn this AST into a cleaner form or whatever you want to do. And then it goes through the rest of the compiler. So like with the previous book and with how far I got into Bob's book, like you basically, once you have that structure, then you're like, okay, let's run it. So you take that let statement and you have like a case, you're like, oh, case it's a let statement, then grab out the name and grab out the value and throw that in a dictionary.
Starting point is 00:19:10 And then like onto the next, right? Yeah, that is a really succinct description of what interpretation means, right? And you cannot find that in any book. Like this microphone that you're seeing it, this is on top of multiple compiler books and you won't find any paragraph in it that says, and then you just take a dictionary and put the value. This is really good description. Because yeah, this is what you do if you run it. Like
Starting point is 00:19:35 if you interpret it, you take the AST, you put it in a function called eval, most of the times for evaluate. And then you say switch, okay, so whatever your language has. My current node, what is it? Is it a let statement? Okay, if it's a let statement, let's open a dictionary and say x equals seven. And then let's interpret the next statement. If the next statement is x plus two, then we say, oh, x, I know x, let's look at the dictionary and take seven from the dictionary and add two to it. And then the wrapping statement might be print or puts. That means, oh, now we have to output nine or something. And that's interpretation. Yeah, it's very succinct. I think, yeah, we've just gotten rid of a whole bunch of books.
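That switch-on-the-node, dictionary-in-hand description maps almost directly to code. Below is a minimal tree-walking evaluator sketch in Go; the node types and the Eval function here are made up for the example and much simpler than the book's.

```go
package main

import "fmt"

// Node is any AST node this toy evaluator understands.
type Node interface{}

// A few illustrative node types.
type LetStatement struct {
	Name  string
	Value Node
}
type Identifier struct{ Name string }
type IntegerLiteral struct{ Value int64 }
type InfixExpression struct {
	Left, Right Node
	Operator    string
}

// Eval walks the tree: switch on the node type, do what it describes.
// The "dictionary" is just a map from variable names to values.
func Eval(node Node, env map[string]int64) int64 {
	switch n := node.(type) {
	case LetStatement:
		env[n.Name] = Eval(n.Value, env) // e.g. x = 7
		return 0
	case Identifier:
		return env[n.Name] // look x up in the dictionary
	case IntegerLiteral:
		return n.Value
	case InfixExpression:
		if n.Operator == "+" {
			return Eval(n.Left, env) + Eval(n.Right, env)
		}
	}
	return 0
}

func main() {
	env := map[string]int64{}
	// let x = 7; then x + 2
	Eval(LetStatement{Name: "x", Value: IntegerLiteral{7}}, env)
	fmt.Println(Eval(InfixExpression{Identifier{"x"}, IntegerLiteral{2}, "+"}, env)) // 9
}
```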
Starting point is 00:20:16 We can just replace them. Yeah, just the last two minutes. Put it for free online. People can learn a ton. So why do we have compilers and stuff? Why a second book even? Again: performance, efficiency, you know, optimizations. A compiler is basically an optimization of an interpreter, and I guess a lot of people that know about compilers just got goosebumps. No, no, you can't say that! But you could theoretically do all of the things that you do with an interpreter with a compiler, or vice versa. That is theoretically speaking. And again, I hope that nobody listening falls off the chair. There is no difference in their capabilities. But if you
Starting point is 00:20:59 take a compiler, for example, you produce an intermediate result. You translate the code and translate it into another language. And that allows you to maybe make it faster and get rid of stuff. And if you think about the example that we just mentioned, where it says, let x equals seven and then x plus two, let's say that's in a loop, for example, in the code. It says run this, I don't know, 500 times, let x equal whatever. Then every time you run the loop in your interpreter, you again have to look at every node and say, oh, is this a let statement? Is this an, I don't know, arithmetic expression or something?
Starting point is 00:21:38 Is this an identifier? Is this a for loop? Whatever. You have to do this every time. And then you can start doing optimizations and you can add caching to it and say, oh, this AST node actually, it caches something that I can then run again because I know what it is now. And then as soon as you start getting into it, you realize, wait a second, I'm just going through a program and look at it and analyze it and translate it
into another form that can be executed. So now you're already doing some form of translation. And compilation is then taking this further and saying, I'm going to look at this program and output it in another language. And if I output it in another language, I'm just going to get rid of all the stuff that I don't need. And I also have to analyze it only once. I don't have to look at everything all the time, especially not when you then run the program. That's the main thing, right? Compilers can be slow. You know, I grew up running Linux and I had Gentoo running. So I basically let my computer run for the whole night to compile my window manager. So I do know how slow compilers can be.
And the reason they're slow is because you're paying the upfront cost that an interpreter pays at runtime. You're paying a little bit more because you're doing additional optimizations, but you're shifting the cost. And you're shifting the cost to a point in time where you're happy to pay it. Right. But if you want to execute your program, you don't want to pay this
Starting point is 00:23:11 cost. You want to just run it as fast as possible. And you only have to compile it once and you run it like a whole bunch of times. Right. Like in theory. Yeah. Yeah, exactly. Just the example I always use is imagine that the two of us, we speak English and we have a friend that only speaks, I speak German. So my friend speaks only German, but he doesn't speak English. And you're saying something to me and I want to translate it for him. So what I could do is after every sentence that you tell me, I could take a sentence, translate it in my head and then say to my friend, oh, he just said this or he just said that. Or what I could do, this would be interpretation. What I could instead do would be to just listen to what you have to say paragraph wise, or
Starting point is 00:23:54 maybe you have prepared a whole speech. I could listen to the whole speech and write down the German translation and then hand it to my friend and just say, read this. This is now in German. You can understand this. And then there wouldn't be this interpretation step because I just translated, I just compiled what you said into this other language. And now there is no overhead of me interpreting all the time. That's a really simple example, but that's the gist of it. And to be honest, as soon as you start getting into the real world concerning compilers and interpreters, the lines get blurry.
Starting point is 00:24:29 You know, it's blurry. I like this metaphor. The advantage in this metaphor would be like, if you have a whole bunch of German friends, because you could hand the paper out to each of them, right? Exactly. Whereas, yeah, because I'm thinking like, if I had some optimizations I could do to Ruby code, right? So I'm going to start up my Rails app. And first, I want to go through and find all these optimizations. But let's say the optimizations take minutes to find, right? Yeah. Like if Ruby is not compiled, then maybe the trade off is not worth it,
Starting point is 00:24:58 because my startup time becomes so slow, right? That's the whole point. As in when are you prepared to pay the cost for making a program run faster? At some point, Ruby worked like we were describing, right? It just executed statements. And then at some point, this could be wrong, but to my Googling, at some point, Ruby emitted C code and then executed it. I think this is actually recently. So what happened is Ruby is now in version, it's in 2.6. And a few years ago with Ruby 1.8, it worked exactly like we just said. It basically went through your Ruby source code and said, oh, you have a class here, class called Fubar. So I'm going to say Fubar the class in my dictionary.
And then you want to say Fubar.new, I don't know, print first name, I'm going to look in the dictionary: what is Fubar? And, you know, it parsed the input, it built an AST, and then interpreted the AST. And then with Ruby 1.9, which is, also, this was around, I think, 2012, or 11, 12, 13, something around that, and, um, they switched to a virtual machine. That means we can go into detail, but for now, let's just say a virtual machine is a different form of interpreter, but a much faster one.
Starting point is 00:26:14 And what they did is they compiled the Ruby AST into something called bytecode, which is what the virtual machine runs on. And then that contains a few optimizations and then have the VM execute that. And that was, yeah, seven years ago, roundabout. And what they are working on right now is what you just said, that they emit C code and then run that. And that is actually incredibly interesting to me when I first saw this, because this is called a JIT compilation. And a JIT compilation stands for just-in-time
Starting point is 00:26:45 compilation, which means instead of what we said, you compile by writing down the translation and then you can run it a bunch of times later on, which is also what we do when we compile a program, for example. Let's say you write a C program, you compile it, you put it in a compiler, you get out a binary, and then you can run it, I don't know, hours, days, weeks, months, years later. JIT compilation means you feed your program to the programming language and it says, only when I need to execute this part of the program, I'm going to compile it. And then I'm going to execute the output of that compilation process, the result. And that is just in time, just when I need to execute it.
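As a toy illustration of that pay-once-then-run-fast idea, here is a hedged Go sketch. The "compile" step just turns a little instruction list into an ordinary Go closure, so later calls skip the dispatch loop; real JITs like the JVM's, or the Ruby approach of emitting C, produce machine code instead, but the shape of the trade-off is the same.

```go
package main

import "fmt"

// A toy "program": a list of operations over a single integer.
type Op struct {
	Kind string // "add" or "mul"
	N    int
}

type Program []Op

// interpret re-reads and re-dispatches the ops on every single call.
func interpret(p Program, x int) int {
	for _, op := range p {
		switch op.Kind {
		case "add":
			x += op.N
		case "mul":
			x *= op.N
		}
	}
	return x
}

// jit "compiles" the program once into a plain Go closure, so later calls
// never touch the dispatch loop again. A real JIT emits machine code here.
func jit(p Program) func(int) int {
	compiled := func(x int) int { return x }
	for _, op := range p {
		prev := compiled
		switch op.Kind {
		case "add":
			n := op.N
			compiled = func(x int) int { return prev(x) + n }
		case "mul":
			n := op.N
			compiled = func(x int) int { return prev(x) * n }
		}
	}
	return compiled
}

func main() {
	prog := Program{{"add", 1}, {"mul", 2}} // (x + 1) * 2
	fmt.Println(interpret(prog, 20))        // 42, dispatching every call
	f := jit(prog)                          // pay the translation cost once...
	fmt.Println(f(20))                      // ...then just run the result: 42
}
```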
Starting point is 00:27:27 And the JVM, the Java Virtual Machine, was one of the first big industrial implementations of program languages that did this. And actually, this was seen as a joke, like you're never going to get this fast. That's not going to happen. But as it turns out, now that there's a lot of time that the JVM can use to first run your program slowly and then compiles it in the background and then executes it.
Starting point is 00:27:51 What Ruby is now doing is they're doing the JIT compilation by basically in the interpreter, opening a file and writing in the C code that you would run if you were to interpret the code in your Ruby interpreter. Then they take that file, pass it to the C compiler, produce a dynamic library, like an object file, load that into memory, and then just call the function they just wrote into a file, just compiled and just loaded into memory. And that is actually, that gives you... Yeah, it doesn't sound like it. It sounds really, do it yourself at home, like a home-built solution, but it actually provides a speed up.
Starting point is 00:28:32 You pay this cost only once. And you get all of the benefits that the C compiler that you use offers. It seems like the takeaway is that this is all a mess. Like the compilers versus interpreters. So we just described several different ways Ruby is compiled, but it's generally considered to be interpreted. JVM considered compiled,
Starting point is 00:28:53 but I think even the JVM does like, it'll back up sometimes. Like it will compile and then uncompile or something. It will go back to the bytecode version. That might be, yeah. Yeah, yeah, yeah. I know what you mean. I totally agree 100%. This is, the lines are blurry, right? And it's this, like, the more you
Starting point is 00:29:11 know, the less you're able to basically give a straight up answer, which is, you know, it's really sad as in like, hey, what's the difference between an interpreter and a compiler? And you start out by saying, hey, an interpreter runs your program as soon as you start the interpreter and feed it a program, your compiler translates it, and then later on you can run it. But in the actual real world, things are converging and interpreters have compilers built in. Compilers actually interpret stuff and then compile the result. One thing that a compiler can do, for example, is to... Like a constant? Exactly. Let x equals five times three.
Starting point is 00:29:48 Exactly. Yeah, exactly. So I'm pretty sure every C compiler that is used in production does this. So if you write in your C program, int x equals 99 plus one, then it's not going to do that calculation when the program is running. It's just going to put the hundred in there and say x equals 100. And that is also a form of interpretation, right? It's an optimization, but it's also interpreting the input.
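Constant folding like that is small enough to sketch in a few lines. This is a hypothetical pass over a toy expression tree, not how GCC or Clang actually implement it:

```go
package main

import "fmt"

// A toy expression tree: either a literal integer or a binary addition.
type Expr interface{}
type Lit struct{ Value int }
type Add struct{ Left, Right Expr }

// Fold walks an expression bottom-up and replaces an addition of two
// literals with a single literal: the "99 + 1 becomes 100" trick.
func Fold(e Expr) Expr {
	if add, ok := e.(Add); ok {
		left := Fold(add.Left)
		right := Fold(add.Right)
		if l, lok := left.(Lit); lok {
			if r, rok := right.(Lit); rok {
				return Lit{l.Value + r.Value} // computed at compile time
			}
		}
		return Add{left, right}
	}
	return e
}

func main() {
	// int x = 99 + 1;  the initializer folds to 100 before the program runs.
	fmt.Println(Fold(Add{Lit{99}, Lit{1}})) // {100}
}
```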
Starting point is 00:30:20 That's interesting. Yeah. Then if you look at what I mentioned earlier, V8, they have, if I remember correctly, and it's changing all the time, but they have multiple interpreters. They start out, if you start a JavaScript program, they start out with one interpreter that runs your thing immediately, because if you think about it, what you as a user want is you want to open a website and have the JavaScript run as fast as possible. Yeah. So that is a hard requirement. So they run this, but at the same time in the background, they start a compiler that compiles the JavaScript into bytecode, you know, which is the first form of compilation. And then they have like their virtual machine running that takes the bytecode and executes it. But then the virtual machine too says, oh, I've now run this 10 times and the user seems to be staying on this page for the foreseeable future. So I'm now going to optimize it. And then it's going to this other compiler and it's going to compile it into native machine
Starting point is 00:31:08 code. It's funny because a colleague was telling me about this. I probably have it wrong, but there's even a layer like below assembly, like in the CPU, it can be doing these types of optimizations using microcode. And the level of abstractions is kind of insane. Yeah, that is, you know, the saying, turtles all the way down, as in you're staring into the abyss
Starting point is 00:31:29 and it just only gets darker and deeper. But yeah, like even the machine code that you emit, like let's say you're writing assembly language and you have your assembler and, you know, I don't know, adding two numbers and outputting the numbers and then you compile that. Maybe a stupid example because it's such a simple thing. But you would think that your computer executes this.
But then it turns out that computers nowadays, or CPUs more specifically, are so complex. They're systems of their own. I don't want to say they have their own brain, but they have so much stuff built in that they actually decide when and how to run your program. So if you have a CPU with multiple cores, let's say eight cores, then the CPU basically decides which core is running what. And what it also does is this one thing, speculative execution, where, yeah, the CPU basically looks at the code that you want to run. And let's say your code is x equals get number from user. And then if x greater than
Starting point is 00:32:34 or equals 10 output foobar, else output, no, whatever. And the CPU will run this code and it will basically detect that there's the if coming, right? And then, okay, bad example was that it outputs stuff. But let's say the thing that you want to do is just add other numbers, like no side effects, then the CPU will say, oh, you know, there's a 90% chance that this will be true. So I'm going to execute the true part before I even evaluated the actual condition.
Starting point is 00:33:07 Before I even looked at the user input, I'm going to evaluate the true as if the user said it's greater than 10. And then if the user doesn't, I'm just going to throw away the result. Because it's so fast. Yeah. Like it's just like I have extra time. So exactly. Yeah.
What's the cost? Yeah. It's funny. For years, we had a TiVo, and I just switched cable companies, had to get rid of the TiVo. It's like a PVR, right? But TiVo, I don't even know if they still exist, they're very smart about taping things. And I remember 15 years ago, we first got our TiVo, like, plugged it in. You know, it's like a Linux box with a hard drive, and it just starts taping shows. Like, it tapes the Ghostbusters movie because, I don't know, it's got extra hard drive space
and people like Ghostbusters, right? Yeah. It's kind of like the CPU is like, I got nothing to do. So maybe we'll hit this if statement. I'll just start executing things. Yeah, exactly. The problem is the CPU has to be really smart about what it executes because it might have side effects, right? Okay. I'm going to bring it way back. So we started, we did the front end, right?
Starting point is 00:34:07 Yeah. So in your book, what's next, right? So I guess you're saying, hey, I want to make this more performant. So instead of executing statement by statement, what do I do? You compile it to another language, which means translating it to another language. That means we take in the AST that the parser produced, and then we translate the AST to another language. And in the case of the book, that is a language called bytecode for a virtual machine.
And just to give you a short explanation of virtual machine, a virtual machine is basically, and this is really, again, sorry, a simplified version, is a computer built in software, a CPU built in software. That means, you know, we basically started this the wrong way. I was just going to say a CPU is really simple in what it does. You know, I can't really say that after what we just said, but the idea is simple, right? The CPU takes an instruction, a piece of program code, from memory. It fetches it, that's what it's called. Then it decodes it. That means it looks at it and says, is this, I don't know, an add instruction? Do I need to add a
number? Or is this a sub instruction? Do I need to subtract numbers? And then once it does that, this is called a decoding step. It executes it. That means it talks to memory, it talks to the ALU, whatever, and executes the statement. Now, a virtual machine is a piece of software that has this in software. This is called the CPU cycle, the fetch, decode, execute cycle. And a virtual machine is doing the same thing in software. Just imagine a loop that says, hey, give me an instruction. What is this instruction? Execute this instruction. Do the thing that the instruction says. That's basically it. It's a piece of software that does this. Fetch, decode, execute. It's modeled after a CPU. That means it also has all of the advantages that a CPU has, which is
Starting point is 00:35:59 universal computation, Turing completeness, stuff like that. And the machine code that the virtual machine is running on is often called bytecode, which are these instructions as in add one, add two, whatever. And this is what we build in the book. We build a virtual machine that is a simplified form of a CPU that runs on bytecode. And the compiler takes the AST that the parser produces and turns it into bytecode and then feeds it to the virtual machine to execute it. And this sounds like more
Starting point is 00:36:33 work, you know, is this faster, like you translating it to another thing, and then you have to execute it again, even after you parse it. But even if we do this while running the program or before we start running the program, the program gets faster. At the end of the second book, the benchmark is, you know, this is the Fibonacci call, is actually three times faster without any other optimizations,
Starting point is 00:36:57 without any profiling, without any memory optimizations, whatever. Just by changing the architecture from an interpreter to a virtual machine, it got three times faster. And the reason is that we touched on this just earlier, that instead of going through the AST and saying, oh, is this a let statement? Is this an identifier? Is this an expression? Is this a while loop? Is this a return statement? Instead of doing this all the time and throwing away the result, recomputing this information all the time, we compute it once and then encode this information into bytecode, a really simple format that allows us to express the whole program, but in a much
Starting point is 00:37:38 simpler language. And then we feed it to the virtual machine who doesn't even have to do the whole dance of, oh, what is a let statement? What is this? It doesn't know about let statements anymore. All it knows about is things like load a number on the stack, load another number on the stack, add these two together, save the result in slot one, slot two. You know, it's a really, really simplified version of the program we had earlier where
Starting point is 00:38:03 all the fluff is basically gone, and it's down to just the computation. So it's like you've made up another language that's even simpler that you've mapped to? Is that one way of thinking of it? Yeah, exactly. You know, that's what a compiler does, translating. And this simpler language, bytecode is a simpler language. It's a machine language for a virtual machine. And let's say you have a compiler that compiles natively. That means you don't need any other thing like a C compiler. It outputs a binary. You can run the binary. You can send it to your friend that also runs a Linux computer or Mac and can also run it. What this compiler does is it takes the C language and translates it into
machine language, which is the language that your CPU and your operating system speak. And what we do in the book, in the compiler for the virtual machine, is we translate our AST into the language that the virtual machine speaks, which is the bytecode that we defined. Just the same as like Java works, right? The day one version of the JVM basically, yeah. Yeah. But yeah, this is the basic idea behind having a virtual machine.
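Putting the two halves together, here is a hedged, self-contained Go sketch: a tiny compiler that lowers a toy AST into invented single-byte opcodes, and a stack-based virtual machine that runs the result with the fetch-decode-execute loop described above. It is a miniature of the idea, not the book's actual code.

```go
package main

import "fmt"

// Invented opcodes for this sketch; a real instruction set would be richer.
const (
	OpPush byte = iota // push the following byte as a value
	OpAdd              // pop two values, push their sum
	OpHalt             // stop and return the top of the stack
)

// A toy AST: integer literals and additions.
type Expr interface{}
type Lit struct{ Value byte }
type Add struct{ Left, Right Expr }

// Compile walks the tree once and emits bytecode. After this, nothing
// ever asks "is this an Add node?" again; that work has been done.
func Compile(e Expr, code []byte) []byte {
	switch n := e.(type) {
	case Lit:
		code = append(code, OpPush, n.Value)
	case Add:
		code = Compile(n.Left, code)
		code = Compile(n.Right, code)
		code = append(code, OpAdd)
	}
	return code
}

// Run is the virtual machine: fetch an instruction, decode it, execute it.
func Run(code []byte) int {
	stack := []int{}
	for ip := 0; ip < len(code); ip++ {
		switch code[ip] {
		case OpPush:
			ip++
			stack = append(stack, int(code[ip]))
		case OpAdd:
			a, b := stack[len(stack)-1], stack[len(stack)-2]
			stack = append(stack[:len(stack)-2], a+b)
		case OpHalt:
			return stack[len(stack)-1]
		}
	}
	return 0
}

func main() {
	// (2 + 3) + 4
	ast := Add{Add{Lit{2}, Lit{3}}, Lit{4}}
	code := append(Compile(ast, nil), OpHalt)
	fmt.Println(code)      // [0 2 0 3 1 0 4 1 2]
	fmt.Println(Run(code)) // 9
}
```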
Starting point is 00:39:09 So what happens if you have, like if I want to add two numbers together, what does that end up looking like? In bytecode, you mean? Yeah. There's two categories, broad categories of virtual machines or computers in general.
Starting point is 00:39:24 The one is stack machines and the other one is register machines. And the difference is how these machines use memory. Stack machines say we do all of our computation in this part of memory that we call a stack and you push values on it and you pop values off it. And if you want to add two numbers together, you push the first number, then you push the second number. Now you have two values on the stack and then you say add, and then we put the result on the stack and then you can pop that off the stack and put it in somewhere else on a hard drive. I don't know. And a register machine has this additional region in memory that is addressable, that
Starting point is 00:40:06 allows you to say, put this number in register A, put this number in register 2, put this number in register 3. And there's the pros and cons of each approach. That's a huge discussion. But generally speaking, people are saying stack machines are easier to build. If I was writing something that compiled to x86 or something, like it's easy because, well, maybe it's not easy, but it can execute on the actual computer architecture.
Starting point is 00:40:34 But in this case, you have to build that virtual machine, right? Yeah, but if I build a virtual machine in the book and I write a compiler for this virtual machine and I compile my input compiler for this virtual machine. And I compile my input programs for this virtual machine. And the virtual machine itself is written in Go, which can be compiled for different architectures, AMD, x86, ARM, whatever you have, 32 bit, 64 bit. Go already has that. And if I built my virtual machine in Go, I can run my virtual machine on all of these architectures. And I only have to compile my code for my virtual machine.
Starting point is 00:41:10 Now, if you compile to machine code, to native, then things get hairy. Because, for example, I'm currently writing, actually in my spare time, I'm writing a scheme to x86 compiler. I'm following resources and x86, you're not going to write the machine code in binary. What you're going to do is you're going to emit assembly language and then assemble that into machine code, right? So you decided for an architecture, then you had to make the decision 32 or 64, then you have to decide which assembler to target. And then also, even if you target the correct assembler, you also need to make sure that the architecture that you want to target is supported by this assembler. For example, I'm emitting currently x86 32-bit code in assembly language for the
GCC, the GNU Compiler Collection assembler. The code that I emit is perfectly fine to compile on Linux, but not on Mac, right? Because it's a different architecture. So then again, you have this, oh, no, it's a different thing. Even though both computers have, like, AMD64s, and I have the same assembler, but it's still a different architecture. So it doesn't work on both. And it's just a lot of work. You know, this is also, I think, if you look at the biggest compilers or the most famous compilers like GCC, LLVM,
Starting point is 00:42:31 a lot of the code is in these so-called backends, you know, which we actually didn't, or I didn't explain yet. The backend of a compiler is the thing that actually takes AST or another form of another representation of the input source code and emits it into this other language, like outputs it, prints it, you know, writes it to a file.
So your backend emits your bytecode. Exactly. GCC has a backend that basically fans out to multiple backends where it says, oh, we're running on FreeBSD and this is a MIPS machine. In your bytecode, how many instructions do you end up with? I would say, I don't know, 30 or something, I guess. That's too much. 20, maybe 20, 30. So, you know, you have things like add, subtract, multiply, equals, equals not, jump, jump if equals, stuff like that. But that is tiny compared to real-world stuff. I think the JVM has 200 or something. There's an upper limit, basically,
and that is the number of different instructions that you can encode in a byte, right? And because you want this to be as small as possible, you want the virtual machine to be able to say, to fetch and decode only one byte and be able to say, this is this instruction. If you had to fetch four bytes to be able to say, oh, this is an add instruction, that would be catastrophic. Okay, I never thought of this before.
Starting point is 00:44:01 So it's called bytecode because it's basically using a byte to store all the possible instructions. Yeah. If you have a jump instruction that says jump to, you know, you need to say where, where do you want to jump? And then you can say instruction 10, 15, 200, whatever. And what you want, ideally, to get the best performance is you want this to be in one byte, right?
Starting point is 00:44:24 So that the machine only has to fetch one byte. And let's say you use only four bits in a byte to encode the type of instruction. That means you have four bits left in which you can possibly encode the first operand, right? So you could have specialized versions of these instructions where you say, oh, if it's a jump instruction and we call this one, this is the instruction with the number 12. This is the small jump, which is able to jump to relative positions up to 128, which is what we can encode in four bits. Then it's just in one byte.
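That packing trick fits in a few lines. The layout below, a 4-bit opcode in the high half of the byte and a 4-bit operand in the low half, is purely hypothetical, just to show the mechanics:

```go
package main

import "fmt"

// Hypothetical one-byte instruction layout:
// high 4 bits = opcode, low 4 bits = a small operand (0..15).
const OpJumpTiny byte = 0xC // made-up opcode number

func encode(op, operand byte) byte {
	return op<<4 | (operand & 0x0F)
}

func decode(ins byte) (op, operand byte) {
	return ins >> 4, ins & 0x0F
}

func main() {
	ins := encode(OpJumpTiny, 5) // "jump forward 5" in a single byte
	op, target := decode(ins)
	fmt.Printf("instruction=%08b opcode=%d operand=%d\n", ins, op, target)
}
```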
We take the first four bits, look at them and say, oh, it's a jump tiny, whatever you want to call it. That means in the other four bits is the target of this jump instruction. Then we don't have to fetch another thing and decode it. Wow. Yeah. That's where it gets messy. This is where, if you think about it, this is where the performance gets won, stuff like this, right? Because this loop, this fetching, decoding, this is basically all that the VM does. And your goal is to shrink that down so that the time that your program, or that the VM and your program, spend is only spent on your program, not in the overhead of the machine,
which brings us back to the beginning where we talked about the Go runtime and how it's so magical that it's basically just this 1% of execution time that is spent in the runtime. And that's your goal. You want the VM to disappear and only have your program execute. And if we rewind, when we were talking about like executing off the tree, right? So we would eval by looking at our current node and figuring out what I should do and execute it. The VM implementation ends up similar, right? Except with a much smaller instruction set. I grabbed the instruction.
Oh, if it's an add, then I have to pop twice and then add them together and then push it back, right? You're 100% right, which is also why sometimes people, and this is really not helpful to the whole thing, people say VMs are bytecode interpreters. Oh, okay. You know, so that is also another level of confusion added there because, yeah, they're interpreting bytecode. And you know, you can also say a CPU is a machine code interpreter, right? Yeah, that's true. Yeah, that's true. So your book is called Writing a
Compiler, but also you have to write this implementation. You have to build a virtual machine that can execute the instructions of your compiler. That didn't fit in at all. Writing a compiler and a virtual machine, that would be a mess. So one thing that if I think about it logically, there's something weird where like, so I take a string, I turn it into a series of tokens, like a linear list, right? I take that list, I turn it into a tree. Then I take that tree, I turn it back into a linear list of instructions, right?
Starting point is 00:47:13 Yeah, yeah. But what you're doing is called you're lowering your program from one language to another. You know, we all have this idea of a high-level language and a low-level language. And what you basically do is when you translate from your high-level language to this bytecode thing is you get rid of all the high-level language parts. It makes sense because if you think about it, the computer executes this very low-level language. And then we've come up with all these ways
Starting point is 00:47:39 to write things more human readable. And so like the compiler is the place where this mapping between the low level and the high level has to live. Yeah, it's funny that in, let's say you were a programmer in the 60s and I wasn't, but if you were, I'm pretty sure that you would be knowledgeable in the low level instructions
that a computer is able to execute. And you could then, if you look at the other language, like Lisp, for example, which looks higher level, as in you have recursive function calls, first-class functions, stuff like this. You can see how it's made up out of these building blocks, because you can then say, oh, you know, a C, for example, has for loops. Oh, this is just, you know, an if and it's a conditional. They don't even say if.
Starting point is 00:48:28 It's a conditional and a jump instruction. You know, I can see how that is built up from these smaller, simpler building blocks. And nowadays, all of us basically that haven't learned assembly language in college or at university, we're doing the reverse process. We're discovering like, oh, my high level language, that's just made up out of these simple building blocks. And it's a really humbling and fascinating experience, to be honest. Like if you can see, oh, wait, it all comes down to just this, you know, set of instructions. And then you realize, oh, wait, this is what Turing Complete is about, you know, that you can express all of the programs that you can basically come up with in your work
Starting point is 00:49:09 with this simple Turing machine, which is what a CPU or a computer is. And then mind blown. Yeah, because people don't know this anymore. Like I certainly didn't, right? I mean, some people do who've been around or who do like C programming or device driver stuff. Yeah, if you do hardware programming, I guess you know this. And to be honest, I wouldn't even go so far and say you need to know this. I think it certainly doesn't hurt to know this
Starting point is 00:49:39 because as I mentioned earlier, if you do performance optimizations, for example, you're at some point gonna touch something beneath the thing you're working on. And then it's good to know how this lower level basically serves your higher level. This virtual machine that you write, it's a proxy for this Go runtime or the JVM or whatever, right? It gives you a way to think about what's happening. Yeah, exactly. Yeah. And it's also, I mean, there's, I guess there's cases where people, they stumble onto things where they say they've run a Java program.
Then they look at the bytecode that has been compiled and that is being executed. And then they look at it and say, oh, there's an additional instruction that's not needed because we're going to throw this result away anyway. I had an interview with Bryan Cantrill and he spent a long time like writing operating systems, probably still does. So I'm following him on Twitter and sometimes he posts things where he's,
Starting point is 00:50:32 this C code equals this assembly code for GCC. Why is this generating this assembly and this, that? Yeah, yeah. You know, when Rust turned out faster than C, then he's looking at the assembly and seeing like, what are they doing here? Yeah, exactly, yeah. Yeah, I think it's crazy cool.
I was intimidated by this for a long time as in, oh my God, these people know so, so much about all of this cool and fascinating stuff. And nowadays that I know a little bit more about this stuff and also know some of these people, I can with confidence say that these people that do compiler optimizations and assembly, whatever you have, they, for example, couldn't build a website that looks as nice as the ones that I built. It's a different set of skills. Programming is a vast, vast field, right? And it's the same as I don't have the slightest clue about graphics programming or game programming. Absolutely no idea, right? My background, my expertise, is web platforms and web applications, optimizing them, scaling them, programming languages, but
no clue about graphics programming. So then does that mean your next book is going to be graphics programming? No, I'm actually, I'm not super interested in graphics programming. I don't know why. I really like the faceless things, as in I like compilers, programming languages, web servers, databases, operating systems. I don't know why, but it's just fascinating to me. This basically goes back to me being 15 and being told, you can run a computer without attaching a monitor to it and
without attaching a keyboard, and it's called a server, and I can connect to it. And ever since then, this has been my fascination. So yeah. So I think that you mentioned that the whole thing, including tests, you're under 6,000 lines of code for your front end, back end, and everything. Yeah. So what is, like, how does that compare to the real world? Are there bigger compilers, smaller compilers? A lot of, a lot of bigger compilers. I actually looked this up when writing the book to have some numbers to back up the claim of all compilers are these huge, complex beasts. And, um, last I checked, GCC has 15 million lines of code.
Starting point is 00:52:51 I think if I remember correctly, it's slightly smaller, but it's still in the millions, even more than 10 million, I guess. As soon as you start adding real world stuff, for example, error reporting, right? Your parser needs to output things like, oh, in this file, at this location, this line, this column, there was an error. I expected this. This is, this sounds, well, I don't know.
It depends on the audience who's listening, but it might sound easy, but it's actually like, this is the brittle stuff that you need to get right. And it's, as we can all imagine, I guess, hard to get right. And then what we touched upon earlier, like the different backends for different architectures, that's a lot of code. Optimizations, that's basically this huge thing in the middle of
a compiler that takes up a lot of complexity. And optimizations are basically programs that take in a program and look at it and say, how can I make this faster? And they do this by either looking at a tiny part or they need more information and look at the whole thing and then look at the tiny part again. And then they optimize the tiny part and have to make sure that it's still correct. Like you cannot remove the stuff that's actually important. You know, stuff like this, correctness, error reporting. I guess you can imagine how complex it gets once you add all of the little details. That's the upper end, I guess, of compilers. Yeah. What's the small, like the lower end? Are there languages as small as Monkey? Yeah, I mean, it depends on your definition of language, but I guess you can write a compiler in, you know, you can
probably write a C compiler in 100 lines of C, but then it's just for a really, really tiny subset of C. It is a compiler as in it takes in C and it outputs assembly language, for example, but it doesn't compile the whole programming language C, only a tiny subset. It doesn't treat the undefined behavior correctly and stuff like this. I guess in the real world, one of the most famous small compilers is TCC, the Tiny C Compiler by Fabrice Bellard. He's the guy who also built FFmpeg and he also built JSLinux.
This was, I don't know if you remember this or if you ever saw it, it's like an implementation of Linux in JavaScript that runs in your browser. And this was 10 years ago, if I remember correctly. This was crazy. And this guy basically, he programs, then he drops this huge bomb, and then he goes back and programs. And he basically built FFmpeg, this incredibly fast video transcoder collection of utilities, comes back with JSLinux, and he also built something called the
Tiny C Compiler, and this is a fairly complete C compiler, as in it compiles basically the complete C language for, I think, multiple architectures, and it comes in at around 70,000 lines of code, which is, it's still a lot, but compared to 50 million lines of code, it is tiny. But if you then say, no, we're only going to support one architecture or just a subset of C, we're going to ignore the preprocessor, for example, and just C itself, we are only going to compile that, then you can go much lower. And there's two projects, the one that I have in mind right now. The one is called C4. If I remember correctly, it is C, a C compiler in four functions of C, which is kind of a joke as a like, you know, tongue in cheek,
Starting point is 00:56:13 you probably shouldn't have done it in four functions. The thing is, it's really like one file and you look at it and it has read function that reads in source code. It has a parse function that parses it, then I guess compile and emit. But there's another project that is 10K lines of code, so slightly bigger, and it's called 8CC. And I don't know where the 8 come from, but it's a C compiler that can actually compile itself. So it's self-hosting. And that is 10,000 lines of code. And it's pretty readable. It's pretty easy to understand. It's clean C code. There's not a lot of fancy stuff going on. So if you're looking for something like examples of how to build a C compiler, this is a good thing to look at.
Starting point is 00:56:56 The guy who wrote it, he actually bought my book and wrote me a message on Twitter saying, hey, I love your book. You're the dude who wrote 8 Seats. Like, why do you need to read my book? And he said, you know, you're my teacher when it comes to explaining code. Yeah, I really like your books. Somebody was asking me, oh, why this book and not like the Dragon Book or something? And I was, it's just like code and explaining how the code works. And then like, here's the code as well. So I like this approach. It's very developer centric. Yeah, thank you. Happy to hear that. And it's, here's the code as well. So I like this approach. It's very developer centric. Yeah. Thank you.
Starting point is 00:57:26 Happy to hear that. And it's also the main reason why I wrote it, right? Because I need this like a language that is not a toy language. It's not a 50 line implementation of something, but it's also not a huge complex compiler with lots of optimizations and stuff. But like the middle thing explained line by line with tests, which I really enjoy. Yeah, so I had to write it myself. Nice. That's a great book. Thanks.
So thank you so much for joining me. It's been a lot of fun. Yeah, thanks for inviting me. This was a lot of fun. I really enjoyed it. That was the show. Thank you for listening to the CoRecursive Podcast. I'm Adam Gordon Bell, your host. If you're listening right now, reach out and let me know what you thought or just that you listened. In the Slack channel,
we have recently been discussing great books on software design. As a guest a couple episodes back, James Koppel, mentioned, there really are very few great books on this topic, which is kind of crazy. Until next time, thank you so much for listening.
