Two's Complement - Compile-Time Programming (with Hana Dusíková)

Episode Date: February 20, 2022

Ben and Matt are joined by Hana Dusíková and discuss panoramic photographs, Matt's career peak, and compile-time programming, including her ground-breaking regular expression library. Links from the... show: Hana's Panoramic photos CTRE library Hana's slides

Transcript
Discussion (0)
Starting point is 00:00:00 I'm Matt Godbolt. And I'm Ben Rady. And this is Two's Compliment, a programming podcast. Hey, Ben. Hey, Matt. How are you doing? I'm doing great. Glad to hear it.
Starting point is 00:00:24 So we've been talking a bit about performance, and I've realized that kind of my career has taken me from hacking around at assembly through to trying to coerce compilers into doing all the work for me, right? That seems to be the journey that I've taken. And I realized that there are people still that are much, much better at making the compiler do all the heavy lifting for for us than I am and so I figured we should talk to somebody who can tell us all about the kinds of things you can do if you really really work hard at the
Starting point is 00:00:57 compiler and and some of the tricks you can do with C++. Okay yeah well that sounds cool. So and so we have a guest today we have a guest we have a guest there's another window on our screen which no one else can see obviously but so surprise to nobody on this call but surprise to our listeners we have a very special guest today uh we're joined by hana dusikova hi hana hi how you doing thank Thank you so much for agreeing to do this. We are excited to have you here. It's too late here, but it's great. And the last few days was kind of up and down, so I'm great. Oh, good.
Starting point is 00:01:34 I'm glad to hear that. And thank you as well for dealing with the time zone difference. We're happily in our lovely early afternoon, whereas you're way into your evening. So we appreciate you taking the time for then. So I just want to introduce you to our guests. Our guests? No. I want to introduce us to our listeners. You are our guests.
Starting point is 00:01:52 You can see how often we do this. So, Hanna, you're a staff scientist at Avast, and you represent the Czech national body for C++, right? Is that the right way to say that? Yeah, that's correct. And also I'm an officer in the C++, right? Is that the right way to say that? Yeah, that's correct. And also I'm an officer in C++ committee. I'm chairing SG7, which is a study group for compound programming and static reflection.
Starting point is 00:02:15 That's not a sci-fi series, right? No, no. I figured. Maybe it's a prequel to Terminator, probably. Well, we'll get to that in a minute, I'm sure. But yeah, I'm always confused because, what is it, WG21 is the C++ Standards Committee? Is that what its name is?
Starting point is 00:02:32 Yeah, it's Working Group 21. And then SG is Special Group? Study Group. Study Group. Okay, so you're SG7, which is Compile Time Programming, and what was the second part? Static Reflection. Static Reflection. Wow, these all sound like very exciting things which is compile time programming and what was the second part and static reflection static reflection wow these all sound like very exciting things that i want now in my programming language
Starting point is 00:02:50 it's really it's really a boring topic it's it but that's now you tell us we just invited you here to talk about this kind of well we talk about all things but um we talk about a lot of the boring stuff on this podcast other people seem to find it interesting i don't understand why but this is just matt and i talking about the stuff that we always talk about i mean uh we can just listen to matt talking and it's exciting i think i think at cppcon matt you did you did like 30 minutes uh stand-up comedy i don't think that's true i mean maybe that you just thought that my my presentation was just so funny but uh no well yeah i don't know you did great because you cannot like you didn't have any feedback and you just go and go well normally the feedback is please stop talking
Starting point is 00:03:42 we've had enough or at least that's what I read on people's faces. No. Bless you. So my understanding is when you're not hacking on C++ and dealing with all the committee stuff, you're also a keen photographer and a Greyhound owner. But there's something cool about the kind of photography you do, right? Yeah, I'm doing panoramic photography.
Starting point is 00:04:00 It's like these spherical pictures, like before Google Earth before Google Earth was cool actually right uh-huh and it's quite technique technical it's not like it's not like typical photography you need to have like tripod you need to have like a special head for that and also white white lens and you need to take i think usually 30 pictures and make sure no one is moving or no not me i'm moving uh like going between uh pictures because then someone is cut in half and it looks or you get two copies of the dog that was running at the same time through
Starting point is 00:04:41 your picture right yeah exactly but you can also erase them. If they move between pictures, there is always some overlap. So you can remove almost the whole crowd. That's amazing. And if you have time and patience. Which strikes me, if you're writing C++, especially the kind of template- c++ that perhaps or you one writes uh yeah you might have a lot of spare time and patience yeah right so what what got you into
Starting point is 00:05:12 panoramic photographing because like panoramic i remember once taking photographs like with my my my handheld and like using some photos panorama stitching software to kind of put it together is that the same kind of thing but just more yeah it's actually uh really similar and it's the technical part is to rotate your lens around a nodal point which in a lens so there is no parallax error and then it's really simple oh i see so this is like this the center of the tripod creates a point, and the lens is always around that, as opposed to just turning the lens and then the point. Oh, okay.
Starting point is 00:05:53 That makes sense. That's interesting. Usually when you do panorama with a mobile phone, you take it in your hand and do stuff like this. Yeah. It's hard to like to describe in podcast yeah no but the thing where you hold your phone out and slowly pan it from one side to the other yeah it makes error but actually google is using it in android to uh extract uh depth from
Starting point is 00:06:17 picture from pictures oh gosh interesting i had no idea they were doing so much like i just take photographs and then every now and then google photos says hey i made a panorama out of four photographs you just happened to take together that looked like they could have been put together. And I have no idea how they do that. Yeah, but, like, my approach is a bit different. It's not machine learning at all. It's just processing power. And it's a lot of gigapixels.
Starting point is 00:06:43 Yeah, you've sent me some images before that have barely almost crashed my browser because they're so huge. So what kind of resolutions are we talking about? We are talking about limits of a JPEG. Right. JPEGs have limits? Yeah, it's 16,000 something and it depends on implementation. So you went 16 somewhere as like a width and height. Interesting. Well, today I learned. And there is actually a funny thing. Not every implementation supports the whole specification of JPEG.
Starting point is 00:07:15 And it depends about the last five pixels. That's the difference between good implementation and bad implementation. The last five pixels is in the middle of the square in JPEG. The macro blocks, right? They're 16 by 16 or whatever. Yeah, and not every implementation is able to handle it. How peculiar. I would have thought this was a completely solved problem by now,
Starting point is 00:07:40 but obviously not. Yeah, I found out when I tried to like tried to uh load jpeg in uh apple preview and it was only in only gray nothing else that's amazing i just thought you know a jpeg is a jpeg i know it's a complicated format and i'm really excited by like the clever tricks that jpeg uses to sort of get the compression levels that it does but i didn't realize that it was much space left over for them to to get things wrong turns out this computing luck is pretty difficult after all getting things right yeah then uh then uh i asked sean parent uh that's like sean parent of adobe photoshop fame right yeah yeah yeah you. You should invite him here too. Yeah.
Starting point is 00:08:25 Yeah, he has always some story. And he told me that he makes sure that the DJ pack and decoder in Photoshop makes all the corner graces. And it was really hard. So have you been sending him test images now? Or was it already handling all of these things, right? Not to him, but to Tony Van for 221 years yep he's working on a projection and he asked me for some big picture so i sent him one or one of them and
Starting point is 00:08:52 he told me uh it crashed their uh application and he needed some fixes that's that's really i mean images right i know you know every now and as i say like you crash browsers that you load up a big image like i'm fond of of looking at these high resolution, similar panorama, actually similar technology, funnily enough, where people have decapped silicon. So you use acid to etch off the top of the chip and then you use a microscope and you take loads and loads of photographs really highly zoomed in. And then you stitch together the die shot of the of the chip and it's super cool because now you can start reading off how the chip actually worked for you know like 6502 era technology at least and at least that's the stuff that i can tractably look at and get some idea what's going on but those images typically are you know they're pngs and they are you know 60 60 000 by 60 000
Starting point is 00:09:43 and so yeah you load them into your browser. And of course, the browser immediately shrinks it down because it knows it can't fit that into memory. But then it's kind of coming down like an old school modem, you know, 14K4 modem. You're like, why on earth in this day and age is it taking so long for an image to come down? And then it kind of goes boom, and it's like Chrome's dead.
Starting point is 00:09:59 And you're like, oh, I see. Or sometimes, again, in Apple Preview, you can open the image. It's totally fine. And when you zoom it, it's suddenly gray. And if you unzoom it, it's still gray. Oh, and gray is like, I've stopped responding gray. Yeah, probably.
Starting point is 00:10:17 I see. Right, that's like, I know Ubuntu does this thing where sometimes if the application is not listening anymore, it's like, well, I'm going to render the last picture you had, but in black and white to show the user that you're not the picture's there but the application isn't yeah it's funny i mean it is i mean it's really interesting these things because you know we in our sort of day jobs we well i speak for myself and maybe for ben here and obviously hannah your jobs are probably very different but like memory is kind of something we don't worry about as much anymore.
Starting point is 00:10:47 We work on server-side stuff for trading, so we just put more RAM in the machines and we're fine, right? But when you're dealing with images that are as large as that, you have to write your software differently. You don't necessarily want to or can even rely on, say, the virtual memory system doing a good job because it can't possibly page things in and out as well as you can if you know how your software is going to access the
Starting point is 00:11:10 image in the first place and i don't think i'd ever really thought about that i think this applies to every domain if you know if you knew a semantic of your algorithm well enough you will always be better than some generic algorithm i guess so i guess so and that's probably true so talking of knowing more about your domain um and a very gramophone reed needle scratch noise uh segue yeah um one of the really cool things that i know you do uh is um work on as we say to the in the intro some some compile time programming things and you're most well known for or at least i'm i know you most well through uh the regular expression library the compile time regular that's not true i'm actually most well
Starting point is 00:11:56 known for my slides that is also true yes but the slides if i'm not wrong are the things that explain the tricks that you use in your ctre library yeah so yeah we should all right we should let you know we'll put some some things in the notes about this but hannah's slides are absolutely legendary and have basically changed the way people do presentations in the c++ space because they are super interactive and she'll show like a parse tree of a you know what looks like a photograph or a picture a nice rendered picture of a parse tree of a regular expression including like this is this node this type that here's the multiple of this one or whatever that kind of like tree view and then everyone's like oh this is cool now we can see how this works and then she'll like single
Starting point is 00:12:38 step through a matching of string and different things are highlighted like so so far so you just hack this in powerpoint and you've just done all it manually and then she'll just edit live give me a give me a regular expression and she'll type it in and the whole tree changes and it's like live updating all the stuff it's the best so that's pretty amazing i mean yeah it's definitely something to aspire to i know when i did a redo of our presentation on memory accesses, I spent a whole bunch of time trying to do similar highlighting of like, this is where the cache comes in and this bit lights up and then this cache line lights across like this. You remember that presentation, right?
Starting point is 00:13:14 Yes, I do. It's a pale imitation of what Hanna is able to do. So, yeah. Hanna, how are you doing all of that stuff? I'm actually programming it. Right. So I implemented a really simple version and a really hockey version of expression matching in javascript too just to be
Starting point is 00:13:32 able to hook them into like into d3 library to draw the nice diagrams and visualize it and i spent a lot of like month you re-implemented the same thing you'd written in C++ in your own particular style which we'll talk about in a second about why it's so cool that you've done it that way you wrote it in JavaScript again just so that you could draw the slides to then tell you as to how you did the C++ version yeah because it was less work than making slides like with some animations because then I change something and i need to like redo everything oh no no way right that's i i've still i was gonna say and i guess because of the nature of the library it would be difficult to like transpile that right like with um the what is
Starting point is 00:14:18 the thing that i'm thinking of graph is no no no no no like the the regular expression you re-implemented the regular expression library to make this work in javascript yes yes so like there i don't know why i'm blanking on the name of this thing right now but there's this the tool that you can use to to basically compile uh oh c++ into javascript so like wasm and stuff um yeah yeah that's what i'm thinking of right yeah that thing like now i'm blanking on it yeah right m scripton and so on yes yes but the whole point is is that you're doing compiler tricks to make that work so it's like the chances of that being supported by the you know wasm compiler is basically zero the thing is the wasm compiler will happily do it
Starting point is 00:15:01 it's just that the everything happens at compile. So you have to actually put the compiler in the web page in order to then actually change it. Of course, yes. It's too late. Who would be daft enough to put a compiler on a web page anyway? So actually, for visualization of the compile time library, you need to have runtime library with hooks to draw it. So it's a different machine. Tell us about the compile time nature of it.
Starting point is 00:15:27 And bear in mind, I'm pretty good at this C++ stuff. Ben has got a working knowledge, and we don't really assume that our listeners have much of a thing. We try and draw them from a wide community. So what is compile time programming? Okay. And why should we care know what why should we care i guess okay uh since i learned about c++ at first i didn't like templates this is because
Starting point is 00:15:51 this is why we can be friends super complex hard yeah exactly and uh then uh i read somewhere that they are actually touring complete so and uh someone in some article jokingly said, you should be able to make regular expression in templates. And it bothered me in my mind somehow. How I would do that? And I was thinking like, through many years, I was like experimenting with some parts only like,
Starting point is 00:16:22 then I finally was able to match only string against another string and then I learned more and more and then suddenly I was able to match regular expression but not regular expression in string but express as a template
Starting point is 00:16:39 like the expression tree of the regular expression but encoded as a template type name like the expression tree of the regular expression, but encoded as a template type name in C++. So it's like repeat, angle brackets, string, character one, character B, character C. Got it. So you had to kind of, as a programmer writing the regular expression, you actually had to write the encoded regular expression
Starting point is 00:17:06 as the tree that it would ultimately be rather than just the string that we know and love, maybe? Yeah, the regular expression. Tolerate. We love and tolerate. We know and tolerate. And actually, I like the form because it's more expressive than regular expression, which is like...
Starting point is 00:17:22 Right. They're very terse, aren't they? And it's nice if I can just imagine seeing one of these and knowing that one of the things that you could put in this are comments inside your regular expression in C-style comments. So at least explain what on earth is going on, which is harder to do, at least with the standard regular expressions. Yes, exactly. And then I was thinking how to convert a string into this form.
Starting point is 00:17:47 And that's simple. I love one parser. Simple. You've just said simple. You glossed over something which is very challenging. Like the only data structure you have in template metaprogramming is a stack. Okay. You can have a type list, and you can add a type in front and remove type from front. That's all you have. So this is kind of very Lisp-like primitives that you have access to, right?
Starting point is 00:18:17 But when you have a stack, you're actually Turing complete, and you can do whatever you want. Wow. I don't think I registered that that was the only thing you really needed in order to be able to do this kind of stuff. I mean, for Turing complete, you can do whatever you want. Wow. I don't think I registered that that was the only thing you really needed in order to be able to do this kind of stuff. I mean, for Turing complete, you need to have two stacks.
Starting point is 00:18:30 Okay. Right. Right. And just to be clear, the thing that you're sort of stacking here is that you're making a new type, which is specialized in different ways, and you're concatenating onto a type. So this is all in the type system of the compiler. There is no actual stack that you're manipulating that's going to be in your runtime program. It's all in the compiler at compile time. Yeah, it's a type. And if you are a type representing your current state in a parser,
Starting point is 00:18:59 and then you are a type representing one character on input. Right. And it modifies to another type, which represent different state of parser. And then you apply this repeatedly in a recursion way in a function. Yep. And at the end, it will return you a result,
Starting point is 00:19:18 which will give you only this is a regular expression, this is not a regular expression. That's like step one. OK. So let me just try to run that past you again so you build you're parsing this thing and you're building this type up and it's sort of pushing and popping as you're yeah following the lalr parser rules and at the end it either succeeds and you get back do you get back a bool or do you just get back a type that is the tag type of saying this is a valid regular expression or this is not a valid regular expression you can fail in multiple fashion you
Starting point is 00:19:48 can fail in some horrible expert like error in compiler also you can propagate uh boolean that the regular expression is uh wrong and then you can have only one static assert checking the boolean and with a nice error uh this is wrong, this is an incorrect error expression. I see. So static assert is an assertion that runs at compile time, right? It's just either true or false, and there's a compiler error if whatever you pass to the static error is not a constant known at compile time and
Starting point is 00:20:16 true. Yeah, it must be known at compile time. Right. And then next step, when you have a parser... Okay, so this is just the beginnings. And then next step is to have some context as a third parameter during parsing. Another stack where you, based on the symbol in the grammar, do some modification on the stack. Okay, so you've got one that's just like the parse stack, which you were using just to keep track of where you are, how many open parens and closed parens that kind of thing and the other one is kind of where you're
Starting point is 00:20:48 putting i'm guessing this is going to be like where the actual sort of compiled regular expression is gonna for like typical expression grammar you can have like uh number and number and then uh action uh plus came and you take two items from top of the stack and wrap them into number. Into plus operation. And the plus operation is actually, you can easily map to the first stage like the regular expression as an expression. So you have like concatenation, so there is an operator concatenation, then you just take two elements from the top of the stack, wrap them, and place them back to the stack. Okay, so it's like a stack machine that you would write,
Starting point is 00:21:33 even if you're writing a very simple math parser type thing, push two, push two, add. Or if you've ever used a TI calculator. Or a Texas instrument calculator.2, add. Or if you've ever used a TI calculator. Or a Texas instrument calculator. Yeah, exactly. But instead, multiplying numbers from input, you actually store the operation with the numbers itself. Okay.
Starting point is 00:21:57 So you kind of just keep it as that. You don't actually add them. You say this is an addition operation with one and one or whatever the equivalent is. Okay. Yeah, the input is actually uh from the grammar the lll one parser will give you give it to you as a stream of operations in postfix notation okay yep so then it's just two atoms concatenation atom concatenation atom concatenation and and then you have another sub-expression, and then at the end, there is multiple options or and.
Starting point is 00:22:29 Right. And then you will have the regular expression as a tree, as a type. Got it. And then you can easily run, because you just call the operator evaluate over the type, and it will go recursively in all branches and at the end it will give you true or false if the input matches this required expression so this time at this point we have transitioned from compile time to run time right so that the yes again i'm just going to write read this back to you to make sure i've got this straight because my head is spinning but
Starting point is 00:23:01 the the result of the pars is accumulated into this the second stack that you're talking about which effectively ends up being a bespoke type who is only responsibility in the entire world is to hold the compiled operations that will match the regular expression that was given right so if i say a or b it's a tree representation of the a and the b and the or or something something ish like that and then that type i can now at runtime give a string and it will do the matching of a or b in this instance uh that's amazing because presumably that means if i make a mistake i think well as you just said if i make a mistake in my regular expression, I don't have to wait till runtime for the thing to throw an exception saying,
Starting point is 00:23:48 oh, by the way, that's not valid. It happens at compile time. But that's actually the first phase. Right, that's the bit we were just talking about before with the Boolean or the tag type, right? Yeah. And then, so what implications does this have for, and what's the difference between, say,
Starting point is 00:24:04 just using a normal normal regular expression your compiler knows everything you are doing and it can inline everything and it's actually going down to assembly like compare compare compare compare red so it really does net out all this sort of very complicated sounding sounding highfalutin type based stuff once it's all boiled down to uh like actual code generation the compiler just essentially writes the handwritten code that you would have done yourself if you were said okay this needs to match either a or b yeah and it's just the two compares and an or or whatever and we're done wow my like first approach because, because I wanted regular expressions in a really performant one,
Starting point is 00:24:51 I was thinking about using LLVM to write the code and use LLVM optimizer. But LLVM is really hard to understand and get into. So I knew C++, so I write it in c++ kind of and then optimizer will do it for me anyway that's so you still get to use llv if you're compiling with clang you really are using llvm it's just you're using it through the medium of normal c++ uh as opposed to like you were saying if you were to you could useVM directly yourself, in which case you'd build up nodes and you'd then build a DAG and you'd build the comparison operations
Starting point is 00:25:29 and the basic blocks that are doing the compares and whatever. And then you'd say to LLVM, hey, can you generate me the assembly for that? Which would allow you to do some of that at runtime, I suppose, which is the only advantage that you might need if you actually wanted to have dynamic regular expressions,
Starting point is 00:25:43 which is usually a security risk, right? If you let somebody type in a regular expression, you're probably doing it wrong, like runtime ways. I think even evaluating a regular expression at runtime is a security risk. Right, because depending on the regular expression, you could have some absolutely catastrophic backtracking stuff going on where it could just blow out this... Not just backtracking, you can just run out of stack. Right.
Starting point is 00:26:09 Right. I suppose so. It's potentially unbounded. Right. I don't think I thought about that. That's actually mildly terrifying given that my website is entirely regular expression based in terms of all of the horrible things that it does. Mostly running at our clients. so it should be fine. Oh, yeah.
Starting point is 00:26:29 No, there's plenty of stuff that's going on on the server, unfortunately. Okay, good luck. But, you know, like, I don't think a regular expression overrun or stack smash is my biggest source of security risk, given that I just take arbitrary bits of user code and run them on my server so you know anyway that's uh also true yeah that's um yeah so that's amazing so compile time is is letting us push stuff that would traditionally be done with like a a static offline parser generator stuff and then a code generator like you said like llvm or even just writing your own stuff and being able to
Starting point is 00:27:05 hoist it into mostly sane looking c++ you know i have a very low threshold for template metaprogram like from user perspective it's just ctr match angle brackets and in the brackets it's there is a regular expression as you know it and so from a user's point of view it looks it's just a very slightly different way of looking at use like the angle brackets instead of the round brackets i think corinthian said that uh they are called hana brackets because of that but hana brackets oh that's what i'm going to call them um that's we've had all sorts of funny names of the things over the time like i i'll call them chevrons you know like the things that like are um um and then operator like this and no one can see this of course on the podcast but i'm doing like um two crocodile hands um facing one way is like the output operator and then the other way is the input operator and and then um
Starting point is 00:28:02 a good friend it's like alligator and this is crocodile something like that yeah i don't know how that works what do you call that i mean we've stopped using these except for their you know as nature intended to shift variables around uh you know bit shift stuff that's the right thing hopefully everyone's using fmt lib fmt this now nowadays for any kind of actual string manipulation um and then the only other one that comes to mind is something that a colleague of mine and ben's used to call um mummy and daddy duck and their baby ducks which is whenever you have to use double ampersands and then the dot dot dot in a template list it's you know amp amp dot dot dot
Starting point is 00:28:37 and it's like the the two mummy and daddy ducks and then the little little baby ducks and i think now i can't unsee that now every. It's not very often I can write code that needs that level of thing, but it's there. What do you call that? It's like a variadic R value. It's a perfect forwarding and variadic pack of variadic arguments. Gosh, that sounds like a tongue twister.
Starting point is 00:29:02 And I'm probably incorrect and probably someone from core will say to me, you should call it. Gosh, that sounds like a tongue twister. And I'm probably incorrect, and probably someone from core will say to me, you should call it unirazor references, you should call it rvalue references, or no. It's the double duck operator. Stop talking about it, double duck. Yeah.
Starting point is 00:29:19 It's the double duck operator. Double duck is the best. I think you're right. No, that's cool. So, I mean other other things i mean regular expressions are probably the um only the beginning of this right like the in terms of like very small grammars that make sense to be able to compile into a program you know we've all kind of realized that string matching is painful if you have to write it yourself and regular expressions for all the warts that we were joking about earlier
Starting point is 00:29:43 they can be a very efficient way of encoding something you want to get done and certainly ctre gives you a pragmatic way of saying here's a string matching thing that everyone understands and yet i won't don't have to pay any kind of runtime performance costs i get perfect assembly output that's going to match it as fast as anyone could possibly write handwritten stuff what else can we do yeah i was gonna say like does this general technique i mean obviously for regular expressions you've you've got this library but does this mean that you can do more things generally at compile time using this technique like you know i can imagine like you know uh a few situations in which you might want to have
Starting point is 00:30:26 other things calculated at compile time and be like, oh, it'd be super cool if I could have a regular expression here, but I can't because it's too early in the process. And I mean, I don't know, that's sort of just my naive impression, but does this open the door to more of those kinds of things? You can calculate anything you want. If you don't have any input-output and we are working on inputs too
Starting point is 00:30:48 so you can actually open a file, read it and then do some transformation and then store it in your binary or generate code based on it. It's from Jean-Hit Embed. The stood embed thing, right. Oh, okay.
Starting point is 00:31:04 This is long, you know, we've always wanted to have one has always wanted to have like a uh a configuration file kind of thing where you know you want to have something which isn't code necessarily but does go go into the code that's a brilliant way of doing it i can just imagine you know json parsers yaml parsers things being able to generate bespoke types potentially i mean template based types i guess not not yamo please not yamo why i like yamo what's wrong with yamo hannah you need to write parser for it and then i will ask again okay all right subset of yamo i get that the the ampersand thing that lets you refer to other parts is probably very very tricky but maybe there's something else i'm not thinking of
Starting point is 00:31:41 yeah all right but i I don't like Jason. Can I just say it out loud? It's not very fashionable. I know Jason is like the lingua franca of everything. But like, I can't put comments in Jason. And that makes it really hard for me to write anything where I'm explaining why some, you know, like package.json for like a node project. Right. You have to have a package.json.
Starting point is 00:32:02 It has to be valid.json. Not only does it have to be exactly valid JSON, like I can't have a trailing comma and I have to always quote everything all the time. I can't explain that I bumped a package version. Please don't do, you know, this is why this is here. I don't directly, depending on this package, I want package, but I need version three.
Starting point is 00:32:16 Otherwise I get a security warning. And I can't put a comment on the line saying, I don't care about this package, but it needs to be three or greater or something like that. And there's just written everywhere. And I know I used to have a trick where I would always just put an extra key of like under, under, comment, under, under, quote,
Starting point is 00:32:31 and then I could put a comment in the middle of a dictionary and just hope that no one cared that there was the thing. Anyway, rant about JSON over. I think actually like parsing JSON is like 200 lines of code in c++ i recently write a bright one uh so that i mean i've heard i know you know um ben ben dean and ironically of course jason turner um did a presentation a long while ago on on parsing jason and i kind of like it feels like that's great and it shows the technique off but i I would never want to do that because I hate JSON so much. Well, I like Jason Turner, just to be clear,
Starting point is 00:33:08 if you listen to this. Jason the person, not Jason the file format. Yeah, right. No, I mean, so I'm wondering what other things. I mean, could one write a C compiler? What? I think someone did already. Get a few years ago there was some someone on reddit uh with context per c compiler that's absolutely daft i love it i mean i've seen
Starting point is 00:33:37 people with you know the brain language brain f language that i won't swear on on our podcast but that one i've seen that that kind of makes. I can see how that could go together, even with my primitive understanding of template trickery. But actually, it's not template metaprogramming. It's actually Consexpr. Oh, so now we... Yeah. Yeah, go on. Tell us about... So what's the difference between template and Consexpr?
Starting point is 00:33:58 Yeah, template metaprogramming is actually abusing the properties of the language. They came with templates to make it easier to write some containers, etc. And then someone found they are actually Turing-complete. I see. So that was never intentional. No, that's never intentional. My naive thing of like, I want to write an array of or a vector,
Starting point is 00:34:23 an expanding array of type T, that's what I think of templates right and you're like well that's not turing complete that's just like yeah fill in the dots with the t in all the places yeah right then there is a contextual programming which is just really bad name uh for this function uh should or may be in compile time and then you are writing just ordinary C++. So you can actually write really easy code, really simple code. And in most cases, you can just add a context modifier to all your function and it will work anyway.
Starting point is 00:34:57 So if you implement that C compiler, just mark everything context pro and it should probably work. Wow. So that's i mean so that's where we've come along from so what use was a kind of amusing hack that was discovered by folks back in the day of like hey this is accidentally turing complete which i mean i think so many things in life end up being accidentally turing complete right um to constexpr being sort of brought into language as a way of tagging
Starting point is 00:35:25 just normal functions and if i remember rightly the first few editions of languages the language that had constexpr it was super limited you could only like have a single return statement they were afraid of that right you know you they were afraid of this and so you could write your constexpr multiply two numbers together which was you know constexpr int multiply int a and b return a star b that was all you could do. And you go, oh, there you are. That's really exciting. I was like, well, what does that help me here?
Starting point is 00:35:47 You could do recursion, but you couldn't do conditions. You couldn't do any cycles. Then they allow it, and they allow it more. In C++20, we allow allocations. So you can actually allocate, you can use a set vector now. In constexpr time. So this is just a normal C++ program that's running in the compiler effectively, and the output of it can then be used in things that are in the compiled program. Wow. You actually cannot return set vector from the function to out from the compile
Starting point is 00:36:26 time because there is like a problem that if you allocate something in compile time, but it will live in a runtime. I see. So there is like the distinction. So yeah, dynamic allocation that you did during compile time can't live on into the lifetime of the program because it's not dynamic anymore, right? How do you delete the memory that was like, well, it was in the compiler. Yeah. All your allocation must be freed when you are leaving the constant evolution context
Starting point is 00:36:49 right but you can use it inside of your function to make the algorithms much easier simple than just using templates and tricks you can actually implement your own stack with vector with actual normal code rather than like consing on the side of some lisp like thing that's getting bigger and bigger and bigger and so you're no longer abusing the type system to hold the internal state of your algorithm you can just use freaking variables right like like a normal person would do that's so cool exactly and then you can implement the parser in much more ordinary way and it will be much faster and i mean not like one or one magnitude but like exponential versus linear and this is what
Starting point is 00:37:34 runtime speed or you're talking compile time speed so the compiler itself is much better at i don't know understanding the code that it normally reads all day every day to compile our code than it is dealing with the abuse of the template system and all the instantiations that requires. Yeah, for example, if you have like a regular expression which is a few kilobytes long, which is crazy, by the way. I've seen the ones for like the match email addresses and they look a bit like that.
Starting point is 00:38:00 Yeah, and if you try to compile it with a CTRE, I don't know if it will ever end and it uh usually it crashes on uh insufficient memory right i can imagine because of all but then i uh made a prototype of ctree using context-per-parsing which is like limiting also the template instantiation and then i tried a few megabytes long cargo expression and it took like half second or something to compile that's such a big deal that is huge yeah like we've talked a lot ben and i one of the things that the themes of this uh of this podcast is like my impatience at waiting for code to build and and Ben's impatient waiting for tests
Starting point is 00:38:46 to run right that's our like our shtick here that's kind of true yeah and so having tools in our toolbox that allow us to use these advanced techniques that are coming down the pipeline or even here already that don't also suck compile time is's very important to be close to my heart. So I'm really excited to hear these. That's amazing. And you can actually easily test your code. I was going to ask. Yeah, exactly. And your code won't ever compile if there is any problem.
Starting point is 00:39:19 You can just run it in compile time and place it into static assert, like require in test. And if you are okay, it will compile. If not, your compilation will fail. So the compiler runs the test implicitly as it's compiling. That sounds like Ben's heaven.
Starting point is 00:39:36 Yes. So the answer to the question of should I check things with the compiler or should I check things with test is yes. Yes, both. All of the above. I mean, that's great. But actually, you should test both.
Starting point is 00:39:52 In C++20, you can actually have different code which will be run in compile time and in run time in the same function. Oh, wow. Yeah, this is something I wanted to talk about, actually. That is one of the things about constexpr is that any code you write that's constexpr can also be run dynamically. So, for example, if you call a function like my multiply routine that I just sort of said, and you need it at compile time, the compiler will obviously evaluate it with all the constants available and promise you that, like, you know, you've got an array of size 10 because you did two times five, right? End of story.
Starting point is 00:40:25 But if you call it with dynamic parameters, it will just generate the code and do it like normal. So therein, and then Hannah's just reminded me or reminded us that you can now ask the compiler, am I being evaluated inside the compiler or am I being evaluated at runtime? Like, am I in the matrix or not? And then you can change the answer,
Starting point is 00:40:44 which of course, is terrifying. And you shouldn't do that. You shouldn't do that. You shouldn't do that. So when might... All right, you said you shouldn't do it, but why would you need to know? Yeah, sometimes it's really hard to implement something in context-perlimitation. It's usually recursive algorithms.
Starting point is 00:41:03 But in the runtime, you can use some intrinsic, for i see or inline assembly i see so like for example if i had a matrix multiply routine four by four matrix multiply routine then of course i want to be able to do that at constexpr because who knows i might be rotating some points or something that's in a big list of things but if i'm running on the real-time system i actually want to use the the underlying x86 architectural instructions myself to do it. So you need to be able to switch on those. So you can have two paths, one for compiler and one for your runtime.
Starting point is 00:41:33 So back to your point that you made, which is that having the compiler run your test is fine, provided you don't have any code that differs. But if you do have code that differs, you should also test it with the normal techniques techniques which brings me on to the next question i have actually about this which is how do you debug this because i'm used to putting breakpoints and printfs everywhere and i don't think the compiler lets me do that or maybe it does yeah there is no tooling for that and uh like one trick i found is to create a template which is not defined. It's only declared, if I'm correct. Okay, so you just say, there exists a thing called Bob, but you don't give a body.
Starting point is 00:42:12 So it's like the compiler can't do anything with Bob if you try to do... It's actually a template type name, something struct Bob. Okay, so Bob is... Yeah, sorry, this is terrible now. I use Bob for anything, or Ian is the other one. But Bob is of T, right? And you know, so you've got a T that you can give it. Okay.
Starting point is 00:42:26 And if you try to instantiate it somewhere, like it's like print for debugging, it will fail in compilation, and it will tell you there is no instance of Bob T. Print for debugging of, and then whatever you put inside. So, yeah, you can do your uh yeah you can throw an exception kind of thing with a with a helpful message but the helpful message unfortunately
Starting point is 00:42:51 is just a type name inside a deliberately left out um bodied this is like good for debugging but uh for uh like when i wrote ctre i start from the parser and I wrote a lot of tests to make sure it's working properly and then I built on that. So when I change something, it will immediately break so I know something is not right and it helps. So obviously test helps. Well, that's very on brand for us. So I'm glad to hear that.
Starting point is 00:43:24 Even when the compiler is doing all the work for you, testing is useful. Yeah, you're actually writing a code. So you need to test it. If you want to test it, then it's meaningless. Absolutely. So how long until we have a template programming language testing framework? Actually I think catch2 already has some macros which will force evaluating something in compile time to check it. So there are definitely already some tools. But in CTRE I was lazy and my tests are actually a bunch of static asserts. And if any of the CPP file fails to compile, then there is a problem.
Starting point is 00:44:09 Right, right. And every time I get some issue or mostly issues, then I try to make it happen, the problem, and then I fix it. Right. If I try to fix it before that i don't know what i fixed yeah that never works out gotta watch it fail first yeah yeah start from a failing test you know and then work backwards otherwise yeah you so actually it's exactly same as normal programming right exactly just like aqua yeah yeah i guess i've gotten so lazy with debuggers like actual
Starting point is 00:44:42 like gdb and whatever that i'm so, you know, I'll happily write something without testing it as much as perhaps I ought to, or at least certainly if I'm trying to understand someone else's code, you can really do far wrong than single stepping through and kind of going and getting the feel of it. And I feel like there were perhaps this tooling
Starting point is 00:44:58 for that that would be useful. Can I? I think you can always hook into compiler with GDB. That's true um if you can understand what's going on inside gdb using sorry inside gcc using gdb i mean i wouldn't surprise me that you can understand that hana frankly no no i definitely can't oh i have to ask by the way so there there is a library called hana already that's a template metaprogramming library. Who was first?
Starting point is 00:45:27 Who stole it? Were you before the library or were you named after the library? I was born before the library. And then they stole your name. No, no, no, no. There is no connection at all. I find that hard to believe. I mean, there's certainly some normative determinism going on here.
Starting point is 00:45:42 Surely you're now part of it. It took me a while. It was like, I think it was a CPK, I did my lightning talk about CTRE, and people were like talking, like, you must know Louis. Oh, yes, right. It's Louis, right? Yeah, yeah. Yeah, yeah.
Starting point is 00:46:00 Out of Wustana. And no, what's going on? I didn't know. So you didn't know that there is like, yeah, out of Boost HANA. And no, what's going on? I didn't know. So you didn't know that there is a big compile time sort of pre-procedure. Actually, I don't even know what HANA is, I'll be honest with you. I know you. It's like a style of using templates, but without you actually knowing that you are using templates.
Starting point is 00:46:23 It's using function argument deduction tricks. Okay. And actually last week we had, no, it might be this week, I don't know, I'm not sure anymore. You have no idea what time it is or what day yet. We had a study group seven meeting and we were discussing future of reflection and we were reminding people that boost HANA style
Starting point is 00:46:49 or HANA style function were rejected. Right. And there's you saying this, right? No, it was David actually. It's funny. But no, literally no connection. That's funny. It is. Maybe connection that's funny it is it's amusing maybe it was prophecy maybe it was prophecy that well take it as prophecy that seems like a great great uh point
Starting point is 00:47:12 to end on here as we've got to time but um thank you so much for being with us today this has been really really interesting and i think i've finally gotten i mean so full disclosure before now i've been to several of honda's presentations on this kind of stuff and I've never really had it stick in my head the way, even when... No, no, no. Matt was my remote clicker. So before it was fashionable to do presentations remotely, Hanna was ahead of the curve.
Starting point is 00:47:38 I forget, what was the reasoning you... My flight was cancelled. I couldn't go to the conference. So I did my presentation remotely, but something failed with remote control of his computer because I wanted to run slides on Matt's computer because there are animations and everything. I was supposed to control it with VNC, but somehow it fails.
Starting point is 00:48:04 So Matt was my clicker. So I was supposed to control it with VNC, but somehow it fails. So Matt was my clicker. So I was telling him, like, next slide, next slide, next slide. And then there was, like, the audience, and there was the slide with the expression tree. And I asked Matt, click on the text box and write some expression. So I got to do the cool thing with the presentation and take all the glory and then poor hano did all the work no no that was that was fun oh man that was fun but and that
Starting point is 00:48:34 was actually funny conference and yeah yeah it's my clicker that's that's my i peaked i peaked. I've peaked in my career now. I've been Hannah's clicker. And I was very proud to be. All right, friends. Well, once again, Hannah, thank you so much for your time and for explaining this all to me. As I say, I think I have a much better understanding now than I think I've had before.
Starting point is 00:48:57 And thank you for inviting me. Yeah, this is great. Super interesting. You've been listening to Two's Compliment, a programming podcast by Ben Rady and Matt Godbolt. Find the show transcript and notes at twoscompliment.org. Contact us on Twitter at twoscp. That's at T-W-O-S-C-P.
Starting point is 00:00:00 Theme music by Inverse Phase. Inversephase.com. cp

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.