Algorithms + Data Structures = Programs - Episode 15: Macros Almighty!

Starting point is 00:00:00 I think I'm, like, the only committee member that's just thrilled about macros and the preprocessor. Like, I love macros. I'm confused on why we're not recording this on, like, April 1st. Welcome to ADSP: The Podcast, episode 15, recorded on February 20th, 2021. My name is Conor, and today my co-host Bryce tries to convince me why macros are awesome. Yeah, so I want to talk about the C++ preprocessor. I think I'm, like, the only committee member that's just thrilled about macros and the preprocessor. Like, I love macros. I, Bryce Lelbach, am a member of the C++ committee, and I am telling you right here and right now: I love macros. Macros are great. I mean, let me put it to you this way. Let me make you the

Starting point is 00:01:07 case. Think of a language feature, like a major language feature from the C++11 era. Every one of them at some point was prototyped in some library-only form in Boost, and it would not have been possible without macros. See, macros are amazing. They're so powerful. They let you do syntax hacks that are just not possible in the regular language, even with a hygienic macro facility like some of the ones that we're looking at having right now. Like, some could argue that consteval functions... consteval in C++ is something that's similar to constexpr, but constexpr says this function may be evaluated at compile time as a constant expression. consteval says this

Starting point is 00:01:59 function must be evaluated at compile time. So it must be evaluated in the equivalent of being inlined fully at compile time. So it's a little bit stronger than constexpr, and it's an error if you use it in a way where it would need to be evaluated at runtime. But even that doesn't give you the same power as macros, because macros are inherently textual. And because they're textual, they let you do all sorts of hacks. Now, of course, as somebody who has had some involvement in modules in the past, I'll also tell you that macros are terrible, and the whole textual inclusion model for C++ is a large part of the compile-time woes that we have. And while I agree with that, they do have their uses. I wanted to do this episode because recently one of the people on my team,

Starting point is 00:03:07 I needed them to create a new fancy macro abstraction for something, a portability layer around some new thing that we're putting into one of our compilers. And as he got started with it, I started teaching him some of the dark arts of macro hacking in C++, and I introduced him to Boost.Preprocessor. And just as I was doing that, I remembered my love and passion for macro programming. And I don't know why I love it so much, because it's kind of horrendous. But I think that that's part of why.

Starting point is 00:03:51 That's part of the appeal. It's like, oh, cool. You would never think that you could do this thing in the preprocessor, but you can. Did you know that the preprocessor is Turing complete? I did not know that. No. Yeah, it's Turing complete.

Starting point is 00:04:08 Have you ever heard of Boost.Preprocessor? I've heard of it, but I don't know what it is. I mean, I assume it's adjacent to the C++ preprocessor, like doing a bunch of stuff with that. It is a preprocessor library. So Boost.Preprocessor is, I would argue, the only generic, general-purpose library for preprocessor programming. And I think it's probably got to be my favorite Boost library. I will always have a special affinity for Boost.Spirit, which I worked on very early in my career.

Starting point is 00:04:53 But Boost.Preprocessor is the one Boost library that I still use today. And it's a little bit crazy, all the stuff that you can do with it. First of all, let's talk about data types. You've probably never thought about preprocessor data types, right? No. Okay. Also, just before we get into this, this is going to be good, because you have a real hard sell here. I started my career working for a company that... I don't want to use the word. I don't want to use that word.

Starting point is 00:05:32 But let's just say that our code base heavily relied on the preprocessor for the different modules that our software system supported. And it led to just crazy things. Like, picture the same function with four different preprocessor contexts around the function declaration, around the parameter lists. And they're all lined up, so, you know, module A has five of the 10 possible function parameters, module B has six out of the 10, and then they line them all up. And they do it to the extent that there are, like, 400-character function declarations with six different contexts. So I've been on the bad side of preprocessors. So, you know, good luck, Bryce. Good luck on trying to get me excited about Boost.Preprocessor.

Starting point is 00:06:35 Show me the magic here. All right. All right. So the first thing that you have to understand about preprocessor programming is that parentheses and commas are special. Because macros are always textual, if I have a macro called A and I call it with a template type that has more than one template parameter, so something like A(X<Y, Z>), you might think that this passes one argument to the macro A, and that argument would be X<Y, Z>. But that's not actually

Starting point is 00:07:40 how it works, because it's textual. It doesn't recognize the angle brackets as having any special meaning. So this will pass two arguments to A, and those arguments will be X<Y and Z>. Follow me so far? But parentheses are special in the preprocessor. They do have a special meaning. And so if I was to take that same argument and just surround it in parentheses, so A((X<Y, Z>)), then it would be passed through as a single argument to that macro A. And so this becomes the basis of all of the preprocessor-based data structures that Boost.Preprocessor supports. So the first one is called an array. And it is, well, actually,
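The comma rule described here can be checked with a small argument-counting macro. This is a minimal sketch and is not taken from Boost.Preprocessor: COUNT_ARGS and COUNT_ARGS_IMPL are hypothetical names, and the counter only handles one to four arguments.

```cpp
// Hypothetical helpers: COUNT_ARGS expands to the number of arguments it
// receives (1 to 4). The arguments push the literal 4, 3, 2, 1 to the right,
// so the value that lands in slot N is the argument count.
#define COUNT_ARGS_IMPL(_1, _2, _3, _4, N, ...) N
#define COUNT_ARGS(...) COUNT_ARGS_IMPL(__VA_ARGS__, 4, 3, 2, 1, 0)

// Angle brackets mean nothing to the preprocessor, so the comma splits this
// into two arguments: `X<Y` and `Z>`.
static_assert(COUNT_ARGS(X<Y, Z>) == 2, "two arguments");

// Parentheses protect the comma, so this is a single argument: `(X<Y, Z>)`.
static_assert(COUNT_ARGS((X<Y, Z>)) == 1, "one argument");
```

The names X, Y and Z never need to be declared, because they are consumed by the preprocessor before the compiler proper sees anything.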

Starting point is 00:08:52 we should start with tuples. Don't you mean tuples? So a tuple... are we going to do this? Tuples? You want me to call it tuples, but I don't... I mean, that's the correct pronunciation of it. Okay. All right, tuples. All right. So a preprocessor tuple is just a comma-separated list of elements that are surrounded by parentheses. So, like, (1, 2, 3). And this is the most fundamental and basic preprocessor data structure. And it's one of the more useful ones, because it's also what you would use as the argument list to a function-like macro, right? Okay. Okay. All right. So now let's talk about the next one, the array. So the array is a two element

Starting point is 00:09:54 tuple where the first element is the number of elements in the array, and the second element is another tuple of the actual members of the array. So for example, (2, (x, y)) is an array with two elements. The first is x, and the second is y. I actually never really used the array preprocessor data structure that much. Usually I use tuples or the next one, sequences. So a sequence is just a group of things that are each in parentheses. So, like, (1)(2)(3). It's just a group of parenthesized things that are right next to each other. Make sense? I'm following. You could say that I'm following, yes. Just for the listeners, in my head currently I'm going,
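The three shapes described so far (tuple, array, sequence) can be written out and poked at with a few hand-rolled accessors. This is only a sketch under stated assumptions: every macro name below is hypothetical, the tuple accessor is fixed at three elements, and Boost.Preprocessor's own accessors are more general.

```cpp
// Hand-rolled sketch; none of these names come from Boost.Preprocessor.
#define MY_TUPLE (1, 2, 3)    // tuple: comma-separated elements in parentheses
#define MY_ARRAY (2, (x, y))  // array: (size, tuple of elements)
#define MY_SEQ   (1)(2)(3)    // sequence: parenthesized elements back to back

// Tuple access: putting a macro name in front of the tuple turns the tuple's
// parentheses into that macro's argument list.
#define TUPLE3_ELEM_0(a, b, c) a
#define TUPLE3_ELEM_1(a, b, c) b
#define TUPLE3_ELEM_2(a, b, c) c
#define TUPLE3_ELEM(i, tuple) TUPLE3_ELEM_##i tuple

// Array size: the first element of the outer two-element tuple.
#define ARRAY_SIZE_IMPL(size, elems) size
#define ARRAY_SIZE(array) ARRAY_SIZE_IMPL array

// Sequence head: SEQ_SPLIT consumes the first "(x)" and leaves "x, <rest>";
// the extra SEQ_HEAD_IMPL layer lets that new comma separate two arguments.
#define SEQ_SPLIT(x) x,
#define SEQ_FIRST_OF(head, ...) head
#define SEQ_HEAD_IMPL(...) SEQ_FIRST_OF(__VA_ARGS__)
#define SEQ_HEAD(seq) SEQ_HEAD_IMPL(SEQ_SPLIT seq)

static_assert(TUPLE3_ELEM(0, MY_TUPLE) == 1, "first tuple element");
static_assert(TUPLE3_ELEM(2, MY_TUPLE) == 3, "third tuple element");
static_assert(ARRAY_SIZE(MY_ARRAY) == 2, "array carries its own size");
static_assert(SEQ_HEAD(MY_SEQ) == 1, "first sequence element");
```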

Starting point is 00:11:14 I'm confused on why we're not recording this on, like, April 1st, or if there's, like, a mega troll coming. Like, I'm just confused, because I'm thinking, like, what? But keep going. You're a top-tier marketer, so I'm sure you're going to wrap this up with a bow and I'll see some value at some point. All right. All right. So next let's talk about a list, and this one should actually be pretty

Starting point is 00:11:48 familiar to you. A list? A list, yeah. It's like a Lisp list. So it's like a cons-style list, with a head and a tail. So if you're familiar with lists in Lisp, it works just like that. The list is always a tuple of two elements, where the first element is the head of the list and the second element is another list that contains all the rest of the list. So if I had a list of three elements, it would be like open paren, 1, comma, open paren, 2, comma, open paren, 3, comma. Then you have this special terminator that in Boost.Preprocessor is called BOOST_PP_NIL. And then you'd have three closing

Starting point is 00:12:46 parens, so (1, (2, (3, BOOST_PP_NIL))). So this is a data structure that should be familiar to you, right? Yep. I mean, still following. All right. So now you've got some data structures. And the Boost.Preprocessor library gives you all sorts of useful facilities for working with these data structures and doing things like, you know, apply some macro to every element of one of these data structures, expand them all in a certain way, access certain elements. And so then you can start seeing how you can do some programming. And of course, you don't really have state in this programming model. So it's sort of a purely
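The cons-style list can be sketched the same way: each list is a two-element tuple of (head, tail), so taking the first element or the rest is just tuple access. The names below are hypothetical and hand-rolled for illustration. LIST_NIL stands in for the terminator and is deliberately left undefined, so it survives expansion as a plain identifier.

```cpp
// Hypothetical names; a hand-rolled sketch of a cons-style preprocessor list.
#define MY_LIST (1, (2, (3, LIST_NIL)))

// A list is (head, tail), so both accessors reuse the list's own parentheses
// as the argument list of a two-parameter macro.
#define LIST_FIRST_IMPL(head, tail) head
#define LIST_REST_IMPL(head, tail) tail
#define LIST_FIRST(list) LIST_FIRST_IMPL list
#define LIST_REST(list) LIST_REST_IMPL list

// Walking the list means peeling off one tail at a time. Nothing is mutated;
// each call just expands to a smaller list.
static_assert(LIST_FIRST(MY_LIST) == 1, "head of the list");
static_assert(LIST_FIRST(LIST_REST(MY_LIST)) == 2, "head of the tail");
static_assert(LIST_FIRST(LIST_REST(LIST_REST(MY_LIST))) == 3, "last element");
```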

Starting point is 00:13:37 functional programming model. Now, there is a way that you can do sort of stateful programming. We'll talk about that a little bit later. It's, I'd argue, the craziest thing. But the idea is that you pass around everything by value, and for any modification that you do, you add to something and then you call another macro with it. There's no mutable state that you can work with. Okay. So then the next thing to talk about, which I'd argue is the most useful Boost.Preprocessor function, is a function called, hang on, let me go find it in the reference, BOOST_PP_EXPAND. And a lot of people's code bases have some version of this. So what expand does is, it's a function-like macro that takes a single argument, and it does double macro expansion on its argument. So this is useful when you want to delay expansion of some macro

Starting point is 00:14:54 until a later time when it's rescanned. So for example, if you want to programmatically construct an invocation of some macro, like, you know the name of the function-like macro that you want to call, but then you want to construct the arguments to it, and then expand it out and call that macro, that's what you'd use expand for. Or if you wanted to do some text pasting, because one of the things you can do in the C++ preprocessor is you can paste together tokens. So if I wanted to dispatch to a macro with a certain name based on, you know, basically catting together some other strings, you would use expand to expand out the name

Starting point is 00:15:56 of the thing that you're going to go and then invoke. So this is, like, the most frequent thing that I would use. Does that kind of make sense? It does make sense, yeah. Yeah. So once you have expand and these data structures, you can start doing a lot of really cool stuff, because you can start, for example, programmatically building up lists of arguments or programmatically iterating through some data. And then you can also do this sort of REPL-type... That's the perfect analogy for expand. Expand is like REPL. So expand is like REPL or, I forget what it's called in Python, maybe it's eval. You know, it's like a function in a dynamic programming language

Starting point is 00:16:47 where you say, hey, give me some text of this programming language, and then I'll evaluate it dynamically. Sorry, pause. What does REPL mean to you? For me, it means read eval print loop. Yeah. That's the same meaning for me.

Starting point is 00:17:06 I'm confused then. REPL is like a way of developing in Python or Haskell where you are just building up expressions slowly and you execute expressions and then you get the value of it back. Right, right. But it's like that idea in... Yes, we're talking about the same thing. I'm sort of using it as an analogy here.

Starting point is 00:17:33 The way that you often use BOOST_PP_EXPAND is when you're dynamically... well, dynamically at compile time, you're building up this expression that you then want to evaluate. That's what BOOST_PP_EXPAND is useful for. It's like, I am building up this macro expression that I want to evaluate, and then I use BOOST_PP_EXPAND to do that evaluation.
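A minimal sketch of "build up a macro expression, then evaluate it" follows. It uses hand-rolled helpers rather than Boost.Preprocessor itself, so EXPAND, EMPTY, DEFER, CAT and the AREA_ macros are all hypothetical names. Two points it relies on are standard preprocessor behavior: the ## operator pastes its operands exactly as written, which is why CAT goes through an extra layer, and a function-like macro is only invoked if the next token on that scan is an opening parenthesis.

```cpp
// Hypothetical helper names; Boost.Preprocessor's real macros differ in detail.
#define EXPAND(x) x                 // its argument gets scanned one more time
#define EMPTY()
#define DEFER(macro) macro EMPTY()  // keeps `macro` from being invoked on this scan

#define CAT_IMPL(a, b) a##b         // ## pastes its operands exactly as written
#define CAT(a, b) CAT_IMPL(a, b)    // extra layer so a and b are expanded first

#define AREA_SQUARE(side) ((side) * (side))
#define AREA_RECT(w, h)   ((w) * (h))

#define SHAPE RECT
#define SHAPE_ARGS (3, 4)

// Build the macro name by pasting, supply a tuple as the argument list, and
// let EXPAND scan the result again. In this simple case a conforming
// preprocessor would already invoke the call without EXPAND; it is kept to
// mirror the pattern described in the conversation.
#define AREA_OF(shape, args) EXPAND(CAT(AREA_, shape) args)

static_assert(AREA_OF(SHAPE, SHAPE_ARGS) == 12, "dispatches to AREA_RECT(3, 4)");
static_assert(AREA_OF(SQUARE, (5)) == 25, "dispatches to AREA_SQUARE(5)");

// Delayed expansion: DEFER(AREA_RECT)(3, 4) by itself stops at the tokens
// `AREA_RECT (3, 4)`, because the token after AREA_RECT on that scan is EMPTY
// rather than `(`. Wrapping it in EXPAND triggers the scan that invokes it.
static_assert(EXPAND(DEFER(AREA_RECT)(3, 4)) == 12, "invoked on the second scan");
```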

Starting point is 00:18:01 I see. This seems, yeah, I think it's more similar to, like, eval in Bash. Right, right. It's more similar to something like eval in Bash. I forget what it's called in Python, but I think maybe it's called eval. I think it might be eval in Python too. Yeah. Yeah. But eval in Bash is a really good example, because Bash and the C++ preprocessor have a lot in common. And of course, you also have the analogs of all of the C++ control flow statements. So there's BOOST_PP_IF, which is, you know, just an if condition. There's a few different forms of that.

Starting point is 00:18:48 There is BOOST_PP_FOR, which is just like a for loop. And then there's a whole bunch of operations for doing all of the various built-in math and working with built-in data types like bools and ints and whatnot. Because of course you need that too in any sort of programming framework. Now keep in mind, in something like a preprocessor #if,

Starting point is 00:19:19 you can just use regular C++-style operations. Like, if you do #if A < 0, that'll just work. But if you're just in macro evaluation, that won't work. In a macro evaluation, if you have A < 0, that's not going to be evaluated specially, because macro evaluation is just purely working with text. So if you want to do something like less than, you need some special magic for that. And now, Conor, I'm going to show you what some of these things look like. And we're going to see how terrifying you find it. So I'm going to show you

Starting point is 00:20:16 the increment function. This is what the increment function looks like. Oh, goodness. How many lines is this file? Well, so this one supports incrementing up to 256. So a lot of the operations in the Boost.Preprocessor library will only work for up to, like, 256 of something. And you might wonder why. That's, like, a weird limit. And I'll explain, and I'll also explain why Conor just laughed.

Starting point is 00:20:56 So BOOST_PP_INC is a function that takes some value and adds one to it. And here's how it does it. There are 256 macros that are called BOOST_PP_INC_IMPL_TAG N. So there's, like, TAG0, TAG1, TAG2, TAG3, TAG4, TAG5, TAG6, all the way up to 256. And for TAG0, the value expands to 1. For TAG1, the value expands to 2. For TAG2, the value expands to 3, etc., up to the last one. And so then what the BOOST_PP_INC macro does is it takes the argument that you gave it, which is a number, and it pastes that number to the string THRUST_PP_INC_IMPL_TAG. So if I call it with, you know, 5, what it does is it takes that 5 and it pastes it to the end of a string,

Starting point is 00:22:07 THRUST_PP_INC_IMPL_TAG, and then it expands that macro. And so that would go and dispatch to the macro that's called, you know, _TAG5, and the value of that macro is 6. So then the result of that macro expansion is 6, which is the result that you would expect if you incremented 5 by 1. This is how all of the basic math and logic operations work in Boost.Preprocessor. I'll show you the funnier one, Conor. So there's decrement that we're scrolling through, but okay. So here's bool, THRUST_PP_BOOL. It's supposed to be a bool data type, just sort of like C++'s bool, where you give it some value and it returns either 1 or 0. And it does the same thing.
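The table-based approach can be sketched in miniature. The block below is an illustration only: the names INC, BOOL and their TAG tables are hypothetical, the tables stop at 7 instead of going into the hundreds, and the real headers are organized differently. The mechanism is the one described: paste the number onto a prefix, and let the resulting macro name expand to the answer.

```cpp
// Hypothetical, shortened tables; the real ones run to hundreds of entries.
#define CAT_IMPL(a, b) a##b
#define CAT(a, b) CAT_IMPL(a, b)  // extra layer so arguments are expanded first

// Increment: INC_TAG<N> expands to N + 1.
#define INC_TAG0 1
#define INC_TAG1 2
#define INC_TAG2 3
#define INC_TAG3 4
#define INC_TAG4 5
#define INC_TAG5 6
#define INC_TAG6 7
#define INC_TAG7 8
#define INC(x) CAT(INC_TAG, x)

// Bool conversion: BOOL_TAG0 is 0, and every other entry is 1.
#define BOOL_TAG0 0
#define BOOL_TAG1 1
#define BOOL_TAG2 1
#define BOOL_TAG3 1
#define BOOL_TAG4 1
#define BOOL_TAG5 1
#define BOOL_TAG6 1
#define BOOL_TAG7 1
#define BOOL(x) CAT(BOOL_TAG, x)

static_assert(INC(5) == 6, "5 is pasted onto INC_TAG, giving INC_TAG5, which is 6");
static_assert(INC(INC(0)) == 2, "results can feed the next lookup");
static_assert(BOOL(0) == 0 && BOOL(7) == 1, "zero maps to 0, everything else to 1");
```

Passing a value outside the table (for example INC(8) here) simply produces an undefined identifier such as INC_TAG8, which is why such tables impose a hard upper limit.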

Starting point is 00:23:08 Just in the case of bool, the tag values are: for tag 0, the value is 0, and for all 255 other tags, the value is 1. There's no better way to do that? There is no better way to do this, because you don't have any basic operations. In macro evaluation, the only real primitive operation that you have is text pasting. So there's the, I mentioned this a few times, there's the double hash (##) operator. So if you do A ## B within some function-like macro,

Starting point is 00:24:00 that will paste the expansion of A to the expansion of B. You have that, and you have the ability to do, like, delayed expansion, and that's it. So if you want to build up any more complex operations, you have to build them up based on concatenating together strings and then expanding them out. So every operation has to be implemented table-based like this. But this actually gets to what I think is my favorite macro trick. It's the one that I use most frequently, and I think it's just one of the more natural and approachable things. You don't have to be doing anything sort of out there, like I would do with the preprocessor, to appreciate this. And it is this pattern for variadic, or not variadic, for, well, I call it

Starting point is 00:25:12 macro overloading. And it's a style of variadic macro where you can write a macro in a way where you can have one implementation if it takes one argument, one implementation if it takes two arguments, one implementation if it takes three arguments, etc. So it's sort of like overloading. So here I'm showing you what the pattern looks like. The idea is, let's say that I wanted to have some macro called, well, the example here is PLUS, where I wanted it to support taking an arbitrary number of things and then adding them together. With this magic thing that in our library we call THRUST_PP_DISPATCH, what you can do is you write your main function, the name of your interface, in this case PLUS, and PLUS is going to be a variadic macro. But it's not going to actually contain any of the implementation logic.

Starting point is 00:26:21 It's just going to have this call to THRUST_PP_DISPATCH. And that call will take as its first argument the first part of a name, so usually you make it the name of the function itself. And then it'll take all the variadic arguments. And then you write a series of implementation functions. So I write a macro called PLUS0 that takes zero arguments, a macro called PLUS1 that takes one argument, a macro called PLUS2 that takes two arguments,

Starting point is 00:27:02 a macro called PLUS3 that takes three arguments. And THRUST_PP_DISPATCH will, when you call it, count the number of variadic arguments that you passed in. And then it will paste that number to the name that you gave it as your first argument. And then it will invoke the concatenated-together name with those arguments. So if I call PLUS, my variadic macro, with two arguments, it will call PLUS2 with two arguments. So I find this to be an incredibly useful pattern for working with variadic arguments, because you can do things like optional parameters very easily here. Right. If I want to have a macro that can take up to three arguments, and the last argument can always be optional, it's very easy and simple to do with this, and I can do it all under one name.
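The dispatch pattern just described can be sketched with a handful of macros. This is a hand-rolled approximation under stated assumptions, not the library's actual implementation: DISPATCH, COUNT and CAT are hypothetical stand-ins, the counter handles one to three arguments only, and the zero-argument overload is left out because counting zero arguments portably needs extra machinery.

```cpp
// Hypothetical stand-ins for the dispatch machinery described above.
#define CAT_IMPL(a, b) a##b
#define CAT(a, b) CAT_IMPL(a, b)

// Counts 1 to 3 arguments: the arguments shift the literals 3, 2, 1 to the
// right, so the value that lands in slot N is the argument count.
#define COUNT_IMPL(_1, _2, _3, N, ...) N
#define COUNT(...) COUNT_IMPL(__VA_ARGS__, 3, 2, 1, 0)

// Paste the count onto the given name, then invoke the resulting macro with
// the same arguments.
#define DISPATCH(name, ...) CAT(name, COUNT(__VA_ARGS__))(__VA_ARGS__)

// The public interface holds no logic; it only forwards to DISPATCH.
#define PLUS(...) DISPATCH(PLUS, __VA_ARGS__)

// One implementation macro per argument count.
#define PLUS1(a)       (a)
#define PLUS2(a, b)    ((a) + (b))
#define PLUS3(a, b, c) ((a) + (b) + (c))

static_assert(PLUS(1) == 1, "one argument selects PLUS1");
static_assert(PLUS(1, 2) == 3, "two arguments select PLUS2");
static_assert(PLUS(1, 2, 3) == 6, "three arguments select PLUS3");
```

An optional trailing parameter falls out of the same structure: the shorter overload can forward to the longer one with a default filled in, for example a hypothetical two-argument overload defined as a call to the three-argument one.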

Starting point is 00:28:15 And that's really the nice part of this: just like in C++, I can have sort of an overload set under the same name. Oh man, my head hurts. That's where we'll end part one of this conversation. Tune in next week for part two of Macros Almighty!
