CppCast - Programming History, JIT Compilations and Generic Algorithms
Episode Date: October 23, 2020

Rob and Jason are joined by Ben Deane from Quantlab. They first discuss the 11.0 update of Clang and a blog post highlighting some of the smaller features that were added in C++17. They then talk to Ben about some of his recent CppCon talks, including one on what we can learn from the history of programming languages and another on the ability to JIT C++ code.

News
- Clang 11.0.0 is out
- 17 Smaller but Handy C++17 Features

Links
- Careers at Quantlab
- Constructing Generic Algorithms: Principles and Practice - Ben Deane - CppCon 2020
- Just-in-Time Compilation: The Next Big Thing? - Ben Deane & Kris Jusiak - CppCon 2020
- How We Used To Be - Ben Deane - CppCon 2020

Sponsors
- PVS-Studio. Write #cppcast in the message field on the download page and get one month license
- PVS-Studio is now in Compiler Explorer!
- Free PVS-Studio for Students and Teachers
- Use code JetBrainsForCppCast during checkout at JetBrains.com for a 25% discount
Transcript
Episode 270 of CppCast with guest Ben Deane, recorded October 21st, 2020.
Sponsor of this episode of CppCast is the PVS Studio team.
The team promotes regular usage of static code analysis and the PVS Studio Static Analysis Tool.
And by JetBrains, the maker of smart IDEs and tools like IntelliJ, PyCharm, and ReSharper.
To help you become a C++ guru, they've got CLion, an Intelligent IDE,
and ReSharper C++,
a smart extension for Visual Studio.
Exclusively for CppCast,
JetBrains is offering a 25% discount
on yearly individual licenses
on both of these C++ tools,
which applies to new purchases and renewals alike.
Use the coupon code JetBrainsForCppCast
during checkout at JetBrains.com
to take advantage of this deal.
In this episode, we talk about an update to Clang and smaller features from C++17.
Then we talk to Ben Dean from QuantLab.
Ben talks to us about just-in-time compilation and the history of programming languages.
Welcome to the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm all right, Rob.
You know, on the topic of being the first podcast,
CppChat actually recorded an episode yesterday, and I was in on that one.
Oh, what was the topic on that?
Rust and C++ Roundtable was the title.
Did you do that with your cousin as well?
No, he did not. He was not on it.
It was other people, including Nicole Mazzuca as well, actually, who we know in the C++ community here, right?
How'd that go?
Pretty good. I don't know, I wasn't watching it so much as being on it.
But it sounds like Rob watched it. I'm giving a sneak peek here; we can ask him what he thought later.
I did watch it.
Okay, well, before we get to Ben, at the top of the episode I'd like to read a piece of feedback.
We got this tweet this week from, I can't pronounce this name, but the Twitter handle is claim.red, and they said, awesome episode, build system related casts are definitely my jam. Bazel's worldview looks really similar to FASTBuild: shared cache, vendorization, etc.
FASTBuild. We've heard of FASTBuild before, right?
Yes, I think so. I don't think we've talked about them in any detail, but yeah, I'm glad you enjoyed the talk on Bazel. It was a good one. You look concerned, Jason.
Oh, I was just wondering why we had heard of FastBuild before.
I was trying to remember who owned that or whatever.
I don't know.
Anyhow.
Well, we'd love to hear your thoughts about the show.
You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com.
And don't forget to leave us a review on iTunes or subscribe on YouTube.
Joining us today is Ben Deane.
C++ wasn't even among the first 10 languages that Ben learned on his programming journey,
but it's been the one that has paid the bills for the last 20-odd years.
He spent most of that time in the games industry.
Many of the games he worked on used to be fondly remembered,
but now he's accepted are probably mostly forgotten.
These days, he works in the finance industry,
writing high-frequency trading platforms,
and the most modern C++ that compilers can support.
In his spare time, he watches a lot of YouTube's educational sector,
practices the Japanese art of tsundoku,
reads about the history of programming, avoids doing DIY,
and surprises his wife by waking in the middle of the night
yelling, of course, it's a monad,
before going back to sleep and dreaming of algorithms.
Ben, welcome to the show.
Thanks very much for having me.
All right, I've got like two questions from this bio.
I think I wrote the bio to tease you there, Jason, a bit.
Well, okay, so maybe I'll take a slightly different tack than you were hoping for, but how many of your first 10 languages were dialects of BASIC?
I'm going to say maybe two or three.
So I started out with Sinclair BASIC on the spectrum.
Then, of course, BBC BASIC and Quick BASIC or QBASIC.
I actually list QBASIC and Quick BASIC separately, personally.
Yeah.
I think there are very minor differences, right?
It's pretty minor, yeah.
Yeah, so three basics, if you like.
But yeah, before C++, there was, let's see, Logo, ML, Pascal, Fortran 77, Lisp, Modula-3, maybe a couple other things.
These were various courses that I had through about second year of university.
Okay.
And when I first went to work, I was writing in C.
We weren't allowed to use C++.
It was very, you know, game developers are very skeptical about new technology,
and C++ compilers weren't nearly as advanced as they are today.
And so C was the thing that everyone trusted.
Okay, now at this moment,
I feel that I must go completely off script
and completely outside of your bio here
and do something that's kind of more interview-ish,
if you don't mind.
Rob, I hope you don't mind.
Go ahead.
But it came up in that Rust C++ roundtable,
and I've heard this conversation,
I've heard this argument many times from many
different people, saying that one of the main problems with C++ is that each company only
allows a subset of the language, so basically everyone's programming in a different subset of
C++, because, you know, whatever, and a lot of the newer features can't even be used because companies don't allow them.
Now, I've heard this comment from many people, but I've never actually seen it happen,
not in any of the companies that I've ever gone teach at or in any of the jobs that I've had personally. So I'm kind of curious, Ben and Rob, since you're both here, if you have seen this,
where you're like, oh, no, my company doesn't allow, and if you're saying my company doesn't allow raw new because we use make unique, that's not what I'm talking about, right?
I'm talking about like my company doesn't allow operator overloads or whatever.
Yeah.
Yeah.
Exceptions is kind of the obvious thing to mention here.
Okay.
Many people who work in games and embedded and systems like that don't use exceptions at all.
They turn off exceptions with a compiler flag.
But other than that,
yes, in particular
this was visible to me around
the time of adoption of C++11 and
14.
So working in the games industry at that time,
everyone
was sort of waiting to see what the new standards would be.
And there were so many features that came in with 11.
It took us a while to, it took people a while to get to know them.
Some were very quickly adopted.
Some were universally a good thing.
Scoped enums are in that category.
Like, game developers just loved scoped enums more or less from the get-go.
They were obviously...
Anything where it's obviously a benefit with absolutely no performance impact,
like that, great.
Things that take a bit more getting used to, like lambdas,
I think it's now pretty much accepted that lambdas,
at least shortish, let's say, Lambdas, are a good thing, right?
In game development, I might say.
Although I haven't been in game development for a couple of years.
But it took a while for people to accept Lambdas.
It took a while for people to be educated about Lambdas.
In particular because I think std::function came in at the same time, I think.
And that confuses a lot of people.
And that really doesn't have the same
performance profile as Lambdas.
So people
were confused about those two for
a couple of years back there in the
mid-teens.
And so, yeah, more it came down to, at least at the company I was working at at the time, the individual game teams. The technical directors and the lead engineers of the individual game teams would get together and decide which features they wanted to open up, as it were,
and which features they definitely didn't want used in their code base.
Okay. Well, that answers my question.
Because, like I said, I've never really seen that.
Every now and then when teaching, I'll have someone say, oh, we're not allowed to do X, and I'll be like, well, you're wrong, and let me show you why, and then we move on. But it tends to be like small, minor things.
Well, and I think quite a few companies adopt, you know, the big third-party coding standards that are out there, like Google's coding standard. There are some prohibitions in there.
Oh, yes. It's been modified over
the years. Yeah, the early versions of that
were terrible.
The early versions of it were absolutely terrible
because it disallowed basically everything that makes
C++ good.
You effectively couldn't use
RAII, effectively.
You couldn't do any kind of operator
overloading. You couldn't do anything
that makes code readable, basically. It forced you into a corner of fancy C, unreadable code.
Okay. I haven't looked at it lately, but do they allow... I imagine something they still
maybe don't allow is user-defined literals because of the scoping issues and the maintenance issues
around that. I don't think Titus is a big fan.
That might be true.
I also know Chandler's not a big fan of them.
So that,
uh,
yeah.
Anyhow.
Okay.
Well,
that was quite the diversion from your bio,
but thank you.
Great.
Yeah.
Okay.
Well,
Ben,
we got a couple of news articles to discuss.
Uh,
feel free to talk and comment on any of these,
and then we'll start talking more about what you've been up to lately.
Okay.
All right.
So this first one is Clang 11.0 has been released and you've got a big
change log here.
Some of the things I thought were interesting was they improved the AST for
when you get some broken code, so they can better diagnose it. It's always good to have more expressive errors. Anything else you want to point out, Jason?
Well, to me, the most notable thing is that they didn't add any C++ features. They have one bullet point of C++ language changes, but yeah, it's pretty minor.
Yes. And I'm like, wait, am I missing something? So I went to the cppreference chart of C++20 features in Clang, and sure enough, Clang 11 doesn't add any new C++20 features to it. I was just surprised.
I mean, I'm not on the Clang development team, obviously, but it could be that they're not counting sort of improvements or bug fixes to existing features, or sort of ongoing work on, for example, concepts, a feature that was added recently, not in this release but before. But, you know, I'm sure concepts has had a few bugs, and they're sort of working on those in general.
But it's not called out as a new feature per se.
That's a conversation you and I had relatively recently about some concepts, differences between Clang and GCC.
Yes, yeah.
It'd be interesting to see if they more agree with each other now or not.
Okay. Anything else we want to call out with this?
I guess at this point, we can go more into new C++ features.
I do find it good and bad that Clang format is still constantly being developed.
Because there's things that Clang format doesn't do right.
Sometimes it gets confused around lambdas and such.
And one of the things that they have in here is lambda things.
But if you are on a team that relies on Clang format and everyone does a Clang format before
they commit their work, then sometimes you end up in a scenario where people don't have quite
the same version of Clang format installed. And then you're going like back and forth on one line
of code or something like that. There's no way around it. It's just a tiny, mild frustration for me.
Yeah, I am on the team that does that.
We use Clang format.
We have a pre-commit hook that,
I think it's a pre-commit hook,
that enforces it.
And we have to make sure everyone's
got the same version, I guess.
Yeah.
I think, were you thinking of
the new familiar template syntax lambdas
when you were saying that Clang format doesn't quite get lambdas right in some cases?
I don't know if it's still...
I know there was a time when there were cases there where it didn't get them right.
Or it didn't, I mean, obviously.
It didn't get them as beautiful as one might expect, I guess.
They're still great.
And there's something else that I have in my current code base
that I'm experimenting with, and I don't remember what it is.
I'd have to go look.
It's not worth the time at the moment.
But clang-format is like not adding a space where it's supposed to.
It's obvious.
This is a comparison, and it has a space on the left side of the comparison but not on the right side, in one particular context, and I'm like, what are you doing?
Yes, I've encountered things like that, also with a requires clause.
Yeah, if it's just, like, requires a single term, like X, whatever, it seemed to format it
slightly differently than if it's requires and then you have something in parens.
It takes away the space between the requires keyword and the paren.
I think I've run into that.
That might be related.
But as a whole, I didn't used to use Clang format when I worked in the games industry.
There was some push to use it.
In fact, I was sort of trying to convince people to use it,
but getting 200 programmers to agree,
we didn't quite get there in my time.
But coming into the finance industry,
I just came onto a team at QuantLab
where we just, Clang Format's just a thing.
It was already there.
And so it was just a case of, all right,
now I just don't, you know,
I don't worry about formatting anymore.
It's just something that happens on save.
And for me, I'd like...
Yeah, if it comes up weird, then I just...
I'm not worried. I know it's still correct.
So I just sort of think, oh, well, that's just how it is right now.
Right.
Fine.
I think on open source projects, too,
it lowers the barrier to entry if you can say...
Instead of having to argue with the person, saying your code is terribly formatted, just say, please run clang-format first, and I'll review your commit after that, or whatever.
Yeah. I often find it useful, when I get an error message that is very templated, to just copy-paste that error into a C++ buffer in my editor
and clang-format it.
And of course,
then it's much, much easier to read.
That's a neat idea.
Yeah, I've never heard of that.
Okay, next thing we have
is a blog post on Bartek's coding blog.
And this is 17 smaller but handy C++ 17 features.
So features that we probably haven't spent too much
of any time talking about,
but are nonetheless useful features.
And he's got a list of 17 here.
Any specific ones you wanted to mention, Jason?
I know one of the ones that I liked
was removing auto_ptr,
just no longer having to see that in the code, not having to worry about it anymore is nice.
Removing old functional stuff recently came up when I was trying to get an older code base
compiling with C++17 Visual Studio, because Visual Studio is the only one that actually
removes these features, just for the record.
Oh, really? Oh, yes. Oh, okay. They don't want to break other people's code, basically. Right, right.
There's one here that has, to me, like this huge implication, and I hadn't even been aware of it before: try_emplace. In this example in Bartek's blog, he passes an object by move into try_emplace. And this becomes a maybe-move, because if the try_emplace doesn't succeed, because an object of that key is already in the map, then it doesn't move that value. Now, C++20 had one change for
this to make move more automatic in more places. And there's another push to make move more
automatic in more places where if you pass something by rvalue reference, it's just going
to be moved. So we don't have to type std move everywhere inside our functions. Well, here is
an example in the standard library of something
that's a maybe move. And if we make move more automatic and more places, that becomes tricky,
potentially. In this case, the example shows moving a string as what will be the mapped type
in the map, right? Yes. The signature of try_emplace actually takes an arg pack
that will construct the mapped type in place, I believe.
So it takes a key type and then an arg pack,
which would be perfectly forwarded to construct the mapped type.
But only if that key doesn't already exist.
Yes, only if the key doesn't already exist.
So I think what you're saying still applies.
It's one of these cases where, you know,
std::move doesn't move, it's just a cast to rvalue,
and here it is writ large.
And I'm not saying I like the fact that std::move doesn't move,
that it's just a cast, right?
I'm just saying whenever I bring this up,
that in cases you might have a maybe-move,
people are like, oh, but... well, here's a case in the standard library.
That's a maybe move.
Right.
One of the things here that I'm interested in is the variable templates for traits.
So in 17, the type traits got the underscore V versions for just the Boolean values.
I've run into this recently because, you know,
I do a lot of, on my team,
we do a lot of compile-time programming
and we use a lot of type traits.
And the interaction with concepts here is interesting
because, so normally you have like trait underscore T,
and that is usually a thing
either of type true type or false type, right?
And then you have a trait underscore V, which uses the value out of that trait, typically. But there was, I think I remember, some discussion in the community about whether the underscore V should use the value, or just be an instantiation of the true type or false type, which is convertible to a bool.
Okay.
Right. So, and I've noticed that, at least, and I suspect this is a standard thing,
because the compilers do this right now in concepts.
You can use bools in concepts, but you can't use true type or false type.
There's no conversion there.
It just doesn't happen in the context of the concept, I guess.
I haven't looked up chapter and verse in the standard for this,
but I found that when...
So the upshot is now all of my underscore V declarations that I use,
I make them use the value from inside the trait
rather than instantiating the trait's type as true type or false type.
And so that makes them usable in concepts.
And that is what the standard type traits do as well, right?
Their underscore Vs are just true or false.
Yeah.
Yeah.
They're just plain old bools.
Which actually makes them faster to compile then,
because instantiating an object of type true type or false type
actually has a teeny bit more compile time overhead.
But it can add up in a heavily templated code base.
I guess so.
One more I wanted to mention was the __has_include preprocessor expression.
You can now basically have a preprocessor check
to see if you have an include header available.
And if you don't, have some different alternative code
to use something
without the dependency on that header. And I know I've seen lots of older open source
libraries where they always have, like, different has-whatever-header preprocessor
defines that you have to define when you're building the library, and you could get rid of all
that, the config.h kind of thing that has to set all these things up.
Yeah.
Yeah.
I would like to call...
Go ahead.
I was just going to say,
I would like to call myself out here and say,
I still don't understand how VoidT works,
even though the first time I watched a conference talk on it
was in like 2015.
I just, you know what?
I'm happy concepts are here.
Yeah, in most cases, it's probably super simple concepts, right?
So you kind of dodged it there.
It had its time.
It's like Visa local bus.
It was here for a while and then gone.
I was going to call out from_chars and also to_chars,
but I found myself using from_chars a lot more.
I like it, because I really just want the one function, where I don't want to have to look at whether to do, you know, atoi or strtod or whatever it might be. With from_chars, I just want the one place where I can go and know that that's the fastest possible thing. Now, all the implementations aren't quite yet there, I think.
But, you know, for example, Microsoft's STL,
Stephan Lavavej, he did a great job, you know, doing
to_chars and from_chars. He had a talk about that, I think, last CppCon.
The one thing I don't like about it is it's not constexpr.
It should be.
It absolutely should be.
And I asked Stephan whether it would be easy to make it constexpr,
and he said it should be fairly easy
because it's mostly just plain old algorithmic stuff.
There's a couple of intrinsics in there,
but nothing that would hinder constexpr, according to him.
And in particular, with the new stuff with C++20
with more and more using string-like things
at compile time as non-type template parameters
and things like that,
I found myself needing from chars
in the constexpr context a lot.
And I'm usually just, you know,
I have to make do with just a very sort of naive
decimal integer only type loop that I have to write.
By any chance, do you just wrap that, just for the fun of it, in is_constant_evaluated, and then call the more efficient thing if you're not being constant-evaluated at the time?
Actually, no. I think that's a good idea, though.
Most times I need it, I know that it's going to happen at compile time lately.
But yeah, that's a good idea. Something for me to think about, and one of the things to play with.
It's also, I think, really important to point out that from_chars and to_chars do not use locale.
And as far as I know, it is the only string number conversion things in the entire standard library that don't use the current locale,
which makes that significant if you care about,
oh, hypothetically, you have a scripting language
that needs to be able to parse floating point numbers,
and someone's using it in Europe,
and they have their locale actually not set to en-US,
and then everything breaks.
I had to write my own number parsing.
Yes.
Okay.
Well, Ben, you had several CppCon talks
that I think we wanted to touch on a little bit today.
To start off, you did this one lightning talk
about older programming language books,
or one specific one, but it reminded me that Conor Hoekstra also did a similar talk about another, like,
50- or 60-year-old programming language book. So what should we, you know, be learning from some of
these older programming language books? What did they know that we, you know, think is still new to us?
Well, I picked up this book because I like to read about the history of our field.
And the more I read about it, the more I find that all the things we think are new these days.
You know, there's a lot of new technology these days, and there are new things these days.
But all the sort of, as I said in the talks, all the elements of the programmer's condition have been the same for the last 50, 60 years, probably. And it's just fascinating to go
back and, you know, this book was printed in 1969. It's by Jean Sammet. And it contains a large
number of languages. You know, you wouldn't necessarily think that there were that many languages extant
at the end of the 60s.
But there are, you know, as well as the ones we know these days
like Fortran and COBOL,
there was just really an explosion of languages
and they were very much more,
a lot of them were very much more special purpose
or fitted to their domain than languages are these days.
So you had languages for numeric computation, languages for string handling.
You know, all these different domains had separate languages.
We can see that a bit in the history of Unix as well, things like awk and sed and nroff
and those kind of languages.
And they harken back to the day when there was this amazing fecundity of languages
and people trying out all these different things
before we sort of settled down
into what we think of today as a language,
which is sort of a general-purpose,
Turing-complete thing.
So your question was, what can we learn?
What was your question?
What should we take away from that, perhaps?
Just that, you know, it's a sense of history.
It's a sense that the same issues we have today, surprisingly, you know, you think of, you know,
are we worried about compile time or we're worried about performance?
They had all those same issues then.
And in fact, sometimes even in ways we don't
think of now, like compile time, compile time was a concern back then, not just because,
you know, it could take programmers time, but because you might, it was much more common to
be compiling something which would only run maybe once or twice, because it was the problem was very
specific at the, you know, and they didn't have general-purpose tooling
to build up things to solve problems.
They had to write programs specifically
to solve specific problems in some cases.
So the compile time could easily be...
Compiling is very expensive
compared to maybe solving the problem.
So the compile time could easily be 50% of the total time,
computer time, spent on the problem.
Whereas these days, we like to minimize compile time,
but for most people, run time outweighs compile time,
almost no matter how long the compile takes. Almost. Right.
Do you ever find
yourself since you do enjoy this history of programming languages and such, saying, you know, this one function right here
that I'm currently working on,
I wish I could write it in Fortran or Lisp or whatever.
E-Lisp.
In your case, maybe.
Well, sometimes.
I don't really ever wish I could write it in...
I mean, sometimes you run up against, you know,
in C++ some particular
odd or annoying
syntax.
C++ is pretty powerful, though, in terms of
what it can do.
It's not always beautiful or nice
to do things.
But yeah, it's more of a case of
not wishing for other languages,
but using...
And the reason to study all sorts of different languages
is to find out how
to differently do things, different ways of looking
at things
most obviously perhaps in the last decade
the influence of Haskell
has shown that in many languages
C++ included
different way of thinking about things, different way of expressing
things to look at
a computation differently. That's, I think, where it comes through.
Okay. How else does your knowledge of all these older programming languages, and the way they did things back in the 60s, help you with your day-to-day work?
To be honest, mostly I'm glad that I have modern tools available, you know, because it's all very well to study history, but it's kind of like going back and playing a video game from your youth and thinking it's going to be great. It's great for five minutes, you know, and then you're like, okay, I've remembered that, I can go away now, and it's best left fondly remembered.
Having said that, there are, you know, just a little bit sort of,
I guess now the 70s counts as a long time ago.
Yeah.
I think of it as a long time ago.
But oftentimes, you know, having a knowledge of things like awk,
for example, the AWK language,
that comes in handy still many, many times,
like maybe a few times a month at least,
when you want to go through some data.
Although Python is pretty much the glue language
that a lot of people use these days for gluing things together
when you don't need that high performance.
But awk in particular is a great language to know
just for throwing things around on the command line
and chopping things up very easily.
And it's surprisingly difficult to beat performance-wise, actually.
Funnily enough, Unix tools,
which have been optimized over 40 years,
are pretty fast.
And for any given problem involving chopping up text, doing it in five minutes on the Unix command line gets you a solution that, even performance-wise, is hard to beat, even if you go to C.
Because, after all, it's all written in C and pretty optimized under the hood.
There was a C++ meetup in the Bay Area talk I went to
a couple of years ago now,
where the speaker compared Unix tools
versus a C solution versus a C++ solution
to solve a problem like that on the Unix command line.
The takeaway I got from that talk was,
you know, just use Unix tools.
Just use sed, awk,
or grep. They're pretty much fast enough, and it would take you quite some effort to beat them.
You see, that's like the kind of thing that I was thinking of when I asked if you wished you
could just write this function and insert language here. I've never thought about this before,
but it does seem like it would be kind of handy if, in the middle of my C++ code, I could be like, I need to do a quick regex replace on this string, you know, at runtime. If I could just put a sed string in there, that would just be sweet.
Yeah, that would be good.
You should work on that, maybe. I'm not sure. But yeah, when I joined Quantlab, the interview test was: write a little C++ program to go through some trading data, you know, do some stats crunching, find averages, min and max, and stuff like that, from a CSV file, basically.
And writing that in C++, yeah, you can do it. It's not difficult. It takes an hour or two, and you end up with a program that's fine. It's probably a few hundred lines.
After I joined the company, I think about this time last year, I went back to the exercise and I did it in awk, and it was 40 lines, and 20 of those are because awk is just so primitive it doesn't have things like min and max. And it handily beats the performance of, certainly, my original solution, and a couple of candidate solutions I tested it against.
Wow.
And it was five to ten minutes' work.
Today's sponsor is the PVS Studio team.
The company develops the PVS Studio Static Code Analyzer
designed to detect errors in the code of programs
written in C, C++, C Sharp, and Java.
Recently, the team has released a new analyzer version.
In addition to working under Windows,
the C Sharp part of the analyzer
can also operate under Linux and macOS.
However, for C++ programmers,
it will be much more interesting to find out
that now you can experiment with the analyzer
in online mode on the godbolt.org website.
The project is called Compiler Explorer
and lets you easily try various compilers
and code analyzers.
This is an indispensable tool for studying the capabilities of compilers.
Besides that, it's a handy assistant when it comes to demonstration of code examples.
You'll find all the links in the description of this episode.
Okay, well, let's maybe move from history and older programming languages
to some new things that are happening in C++.
You also did a talk on JITs, right?
Yeah, Kris Jusiak and I put together a talk on just-in-time compilation
based on a paper by Hal Finkel from a few years ago.
Now, it's gone through some revisions.
The idea behind it is: we extend Clang,
and there's a new attribute you can use on a function template,
the Clang JIT attribute.
Okay.
And what that means is that the function can be,
the template can be instantiated at runtime through the just-in-time compilation mechanism.
And basically the mechanism for that is you build a string, literally a std string, and
you put the contents of that std string in as the template argument.
So your std string might literally be the string int, I-N-T.
And when that gets jitted, the function template is instantiated with a template
argument, int. But you can put
any string in there, so as long as that string
in some sense resolves to a
type, that gets
just-in-time compiled.
That template is instantiated at run time
at the point of use.
So that's the basic mechanism of it.
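A rough sketch of what that looks like, going by the description here and Hal Finkel's paper. This requires the experimental Clang fork, so it does not compile with a stock compiler, and the exact attribute spelling should be treated as an assumption:

```cpp
// Sketch only: [[clang::jit]] exists in the experimental Clang-JIT fork,
// not in mainline Clang.
#include <iostream>
#include <string>

template <typename T>
[[clang::jit]] void crunch() {
    std::cout << sizeof(T) << '\n';
}

int main(int argc, char* argv[]) {
    // The template argument is a runtime string: "int", "double", or
    // anything that resolves to a type when it is JIT-compiled.
    std::string type_name = argc > 1 ? argv[1] : "int";
    crunch<type_name>();  // instantiated at run time by the embedded compiler
}
```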
And it means, yes, it means,
now you're thinking, yes, it means that the whole compiler is based, or at least the whole clang,
yeah, the whole compiler is inside your application.
And so naively your application is quite large.
But there's a way, as Kris and I showed in the talk,
to once you've instantiated the template,
basically in memory then you have the IR, the AST,
and you can save out the IR, which has already been through the Clang frontend
and through the optimization passes, and then is ready for basically
very cheaply turning into an object file and linking.
So although your naive sort of program that is all-singing, all-dancing
and allowing all of this JIT to happen is large,
you can end up having JITed the right things.
You can end up outputting a relatively small binary that does exactly what you need.
But instead of being based on things you had to foresee at ahead-of-time compile time. This JIT phase I think of as sort of a configuration time
or an install time setting. You know, it's distinct from ahead-of-time compile, and it's
distinct from final runtime, but one usage of it is sort of as a configuration
time where you're like, okay, I know something about my data now. I can use that to output
exactly the optimized binary that I need to crunch my data.
But is it only types, template parameter types, or can you also do non-type template parameters?
You can also do non-type template parameters. Yeah, I was explaining it in terms of regular templates, but you can do non-type template parameters.
Okay, but it is only the template parameters,
only the bits between the angle brackets,
if you will. Yes.
But that gives you a lot of power
as you can imagine.
In particular,
when you mix it together with
lambdas in
unevaluated contexts,
being able to do the decltype
of a lambda expression, and
in particular the decltype of an
immediately invoked lambda expression,
then basically you can
write an arbitrary lambda,
returning an arbitrary thing, doing
arbitrary computation, immediately
invoke it, stick the whole thing in a decltype,
pass that to a function template,
and now you have the ability to provide for your users
to inject types at this JIT time.
Because inside the lambda, you can have a local struct,
and you can return a thing of that type,
and you can put functions in there, in your struct, inside the lambda,
and you can apply concepts to this type.
So you can say, user, give me a lambda,
write this object in your own way for your use case,
but I expect it to adhere to these concepts.
I expect it to be some kind of IO thing,
so I expect it to have a read function
and a write function, for example.
So at which point do you get the compile-time error, if something went wrong, versus the JIT-time error, I guess?
So, yes, you get a compiler error, a JIT-time compiler error, if there was a malformed
piece of input there, right, as you might expect.
And so it does sort of involve sometimes turning your users into C++ programmers.
They're used to working in a certain domain.
They need to get a little bit more used to C++ to know what the error messages mean sometimes.
So do you have practical, where you've actually seen this be useful kind of things that you can talk about?
Well, I can't get into the details, but yes, you know, exactly.
Kind of, if you extrapolate from what I've been saying, you know, imagine that you have a large amount of data that you want to crunch.
You want to get some statistics out of
the data, right? It puts the power, it puts the choice of how to do that in the hands of the users,
rather than having to bake in everything ahead of time. So if the user only wants to look at this field
out of the data, and all these other five fields they don't really care about this time, they get to do
just that, through JITting what they
need.
So the type that you would pass to your template might be something like what we would consider
dependency injection.
I think that might be a good word, yeah.
Yeah, where you could say, okay, here is a class that I just created just for your template, and it has a compute function, and now it's going to
be able to highly optimize that whole
flow, because that compute function
is known at JIT compile time
for this set of data.
That's a good way to put it.
And there really is no difference
because it's still Clang
under the hood. It's still the compiler.
And it's still
you can actually
pass in JIT-time compilation flags as well. Usually we match them to the ahead-of-time compilation
flags. So, for example, you know, if everything is O3, you pass in O3, and everything is optimized with
O3. So it's literally the case that the JIT-time code is as optimal. It's sometimes more optimal, right?
Because you know more at JIT time than you did ahead of time.
But it's, you know, in theory, absolutely no different.
The code is just the same.
It's just coming out the same compiler as it were.
Okay.
Now, once you have this object file, I mean, I've played a little bit with embedding Clang
in a project and I'm like, okay, I created an object file.
Now I need to link it.
And it seems like invoking the LLD is a little bit painful
if you want to invoke it from inside your program.
I mean, if you want to embed the linker versus calling the external linker.
Right, right, right.
So I'm just curious if you solved that problem by calling the external linker
or if you tried embedding LLD also.
No, we didn't embed LLD.
We have some driver scripts that make the binaries out of the emitted IR code.
I would stick with that plan.
Yeah, that seemed like a good idea.
A better idea than trying to do everything in one.
Compartmentalization of concerns, that's
a good thing, right?
The project I'm working on, just as an aside, it has to be cross-platform.
Everything I work on has to be cross-platform, and the goal is to make a single binary with
no external dependencies. So embedding LLD then became the next phase of pain. Right. Sounds like fun. Yes.
What kind of limitations are there with this? I mean, how far have you
pushed it?
Most of the limitations come from
a couple of limitations.
Really, we're trying to use things
which are so modern
that they aren't quite here yet
in some cases.
So, Lambdas in unevaluated contexts is a very new feature,
and the compilers do an okay job,
but they don't cover all of the corner cases.
When you get into actually using this thing for real
and actually doing sort of arbitrary things inside Lambdas,
you can expect sometimes to run into issues.
One of the things that we do that's not in standard C++ yet, we're hoping it will get in, is
allowing templates in local functions.
Is that right? Templates in local functions?
Templates in local functions, yeah, has still not been allowed. I mean, we get around it with generic lambdas.
Generic lambdas are templates,
and they can be in local functions.
Except they're not.
They're defined outside the func... Well, whatever.
Well, yes.
Anyway, the fact that you can put generic lambdas
already in functions indicates that
this probably doesn't have an implementation bottleneck
in a compiler.
And so it would be nice to have that.
There is actually a proposal out for that right now.
There is a proposal. I'm not sure of its status. There has been a proposal a while ago. There were two proposals, I think, and they're about to be, or have
been, merged. The two authors working on them were told to get together by the committee, I think.
think and i feel like that restriction is like you can't –
I've tried to work around it because you can have a local class.
That's not a problem.
But you still can't have a template function inside of a local class
inside of a function.
It'll be like, no, you're cheating.
You're not allowed to do that.
Right.
Yeah, and there are ways around that, as you say, with generic lambdas.
Yeah.
Which really are templates inside.
If you're interested in playing around with the JIT,
is there somewhere on GitHub you can go to test this out?
There is. Hal Finkel had a fork of Clang.
As I say, we're using version one of his paper,
and his paper has progressed since version one
and has changed in a few ways based on feedback from the committee and others, I think.
Actually, I'm not sure offhand where to go, but we can dig up a link and put it in the show notes.
Since you mentioned the paper again, how is the committee responding to this
paper? Do you know? I don't know.
I don't know how actively... Clearly, this is a massive feature,
and I don't think the committee has time
right now to review this feature, what with all
the others, probably. I don't know if it's something that we'll get eventually. You
know, I'm not optimistic about it being in the standard inside of 10 years.
Right, but yeah, after everything else is done, we'll get to that paper.
Yeah, maybe. Maybe. After Bjarne
gets operator dot, which is probably
at least nine years away at the
moment.
Okay. And then you also did
a talk at CppCon
about algorithms, right?
Yeah. I did a talk about
how to construct
generic algorithms.
And what are some of the guidelines you go over
about constructing your own generic algorithms?
Yeah, well, that was an interesting talk because...
So the first thing you get whenever you do a talk
that involves that sort of problem,
there's the tendency for people to...
You know, I got messages on social media after the talk saying, hey, you know, good talk,
did you know you could also solve the problem this way? People get caught up in how
you're solving the problem rather than the process of constructing the generics, if you like.
So yeah, the idea behind that talk was not really to say, this is how you solve this problem, but rather to say, here are some things: given that you have solved a problem in your code in a non-generic way, here's how you can alter that code. Here's how you can make it generic without losing speed and, in some cases, adding genericity, so making it useful for more use cases. In particular, things like looking at, you know, the obvious thing is,
you convert it to a template, and then it works with iterators and/or ranges, and you have to decide,
do I want to provide this for forward iterators or random access iterators or bidirectional
iterators, and what are the various performance trade-offs there, perhaps.
And then there are things like affine space types,
like std::chrono
has, where you look at
separating the types.
So
Adi Shavit and Björn Fahller had a great
talk about this.
The idea is that sometimes
you shouldn't
assume that a type is one thing.
For example, you've got a function that takes a couple of arguments.
And when you're writing it first, you think of those arguments as having the same type
because you just want to add one to the other or do something like that.
But affine space types are two types that work together to form an affine space. And
what that means is, basically, one of the types is a point type, meaning a position
in the space, and one of the types is a vector type, meaning a length or a direction or a duration.
So in chrono we have time_point and duration, and if you look in the standard, you can find that you can
add a duration to a time_point, and you can take away a duration from a time_point. If you subtract two
time_points, the result is a duration. If you add two time_points, well, you can't do that, because
it's meaningless to add points, right? Even though these things have the same representation under
the hood, even though they're just ints or floats when you get down to the registers.
The type system gives you this ability
to encode what's sensible for the types.
And so how that converts to generic algorithms is,
you know, I have a generic algorithm
that works on integers.
I like to see if it works on chrono durations,
for example, because that's quite often
a useful use case.
And if it works on those, does it work on time_points?
And, you know, sort of my go-to test case for making sure this affine space
stuff kind of shakes out correctly is time_point and duration.
Okay.
So you said, you started that by saying you tend to think of two types as the same, and
you're talking about like reducing the requirements, I guess,
on your algorithms?
Yes, yeah.
So, or making them sometimes more generic.
You know, another good example of this is when ranges,
we get ranges in 20, but before that in 17,
we got the sort of relaxation of begin and end iterator,
stuff for like range force.
So instead of being two iterators,
it's an iterator and a sentinel now okay and so we've more correctly more more more generically realized
what's required of that of that end iterator as iterator as was but now we call it a sentinel
and what's required is really that it's comparable to the begin iterator and the other position.
But that end iterator never gets dereferenced, obviously,
never gets incremented or anything like that.
And so now we've sort of more cleanly recognized that
and scoped the requirements on that type.
And therefore allowed things to be more generic.
The functional helpers, like plus and minus and less-than,
those things, pre-C++14,
I think it was, maybe it was 11,
they required both sides to be the same type,
but now they've got a generic void type
for the template instantiation that lets them be two different
types.
Oh, right, okay. Sounds like a similar kind of scenario.
Yeah, could well be.
I'm curious about something. As you're talking about algorithms and your interest in compile-time stuff,
if I recall correctly, back in the C++11 timeframe, you re-implemented the standard algorithms in
constexpr, C++11 constexpr.
Yeah, I did. I have a GitHub repo, which is now lying fallow, which
was experiments around that, yeah.
Did it all work eventually, is what I'm curious about?
Well, a qualified all.
All that I bothered to implement, and I wasn't rigorous in any
sense. But as you know, most
of the algorithms were made constexpr just by putting the keyword constexpr in front of them.
Right. And so in that sense, although, you know, that was C++14
constexpr, when it was
not just a single expression.
But yes, I mean,
back in C++11,
it was a lot of recursion,
a lot of use of the ternary operator,
a lot of simple
linear recursion on sequences.
But yeah, it basically worked,
I think. As I say,
I haven't really kept it up. The GitHub repo was an experiment.
It's out there, but it's lying fallow.
Just a random aside, I thought about it while you were talking about this.
Yeah, I had a lot of fun doing that.
And somewhat masochistically, perhaps,
I also did implementations of MD5 string hashing
at compile time in C++11.
And that was interesting.
Someone just shared with me a whole CRC library
that is 100% constexpr.
That's interesting.
Maybe something for us to talk about in a future episode, Rob.
Okay.
Well, Ben, it's been great having you on the show again today.
Is there anything else you wanted to talk about before we let you go?
Well, my company is hiring.
We're actively looking for good people all the time.
So I can put a plug in for that.
Quantlab.com slash careers.
So, and in fact, my team is hiring.
So, you know, if you'd like to work on a team using the most modern C++ that compilers support,
solving really interesting problems, then drop us a line.
And a team that cares about good code quality as well.
Yes. Yeah.
I have to say, actually, the team I work on right now is really one of the best teams I've worked on in my career. And it was a definite... you know, so when I worked in the games industry,
I was taking an interest in modern C++, and I was doing a lot in my spare time,
and I was starting to do conference talks.
And then making a career change, coming to the team I'm on now,
that also was a sort of sea change in the style of C++ that I was writing.
So I learned a lot.
Cool.
And I continue to.
Okay.
Well, it's been great having you on the show again today, Ben.
Thank you.
Thanks for having me.
Thanks for coming on.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic, we'd love to hear about that too.
You can email all your thoughts to feedback@cppcast.com. We'd also appreciate it if you can like CppCast on Facebook and follow CppCast on Twitter. You can also follow me
at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank all our patrons who help
support the show through Patreon. If you'd like to support us on Patreon, you can do so at patreon.com/cppcast. And of course, you can find all that info and the