CppCast - Concepts and Algorithm Intuition
Episode Date: November 19, 2020

Rob and Jason are joined by Conor Hoekstra. They first talk about new and updated libraries in Boost and Herb Sutter's trip report covering news from the recent virtual ISO plenary meeting where the first new features were voted into C++23. Then they talk to Conor about some of his recent conference talks on Algorithm Intuition and Concepts vs typeclasses.

News
- Butano: a modern C++ high level engine for the GBA
- New Boost libraries in v1.75
- Trip report: Autumn ISO C++ standards meeting (virtual)

Links
- Conor's Conference Talks
- ADSP: The Podcast

Sponsors
- PVS-Studio. Write #cppcast in the message field on the download page and get one month license
- PVS-Studio: analyzing pull requests in Azure DevOps using self-hosted agents
- Why it is important to apply static analysis for open libraries that you add to your project
- Use code JetBrainsForCppCast during checkout at JetBrains.com for a 25% discount
Transcript
Episode 274 of CppCast with guest Conor Hoekstra, recorded November 19th, 2020.
Sponsor of this episode of CppCast is the PVS Studio team.
The team promotes regular usage of static code analysis and the PVS Studio Static Analysis Tool.
And by JetBrains, the maker of smart IDEs and tools like IntelliJ, PyCharm, and ReSharper.
To help you become a C++ guru, they've got CLion, an Intelligent IDE,
and ReSharper C++,
a smart extension for Visual Studio.
Exclusively for CppCast,
JetBrains is offering a 25% discount
on yearly individual licenses
on both of these C++ tools,
which applies to new purchases and renewals alike.
Use the coupon code JetBrainsForCppCast
during checkout at jetbrains.com
to take advantage of this deal.
In this episode, we talk about new boost libraries
and the virtual ISO plenary meeting.
Then we talk to Conor Hoekstra.
Conor shares his thoughts about concepts
versus type classes and much more. Welcome to episode 274 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm all right, Rob. How are you doing?
Doing good. Don't have too much to share myself.
We're getting close to Thanksgiving under these strange times.
Well, I believe our guest today already had Thanksgiving, actually.
It's true, it is celebrated in other countries differently, I guess. Right, yeah.
We'll get to him in a moment then, but first, at the top of our episode, I'd like to read a piece of feedback.
So we had Teresa Johnson on a couple weeks ago talking about ThinLTO, and I think we may have asked the question at one point, like, if she knew what MSVC's linker was like. Right.
And we got an email from Kenny Kerr, who we had on the show a while ago, and he wrote: hope you're well. I wanted to point out that following a recent show on ThinLTO, someone was internally asking about Visual C++
and asking when it would catch up.
He says he hasn't listened to the show yet,
but wanted to let us know that Visual Studio
has had multithreaded compilation
and multithreaded linktime code generation for years.
And if anything, ThinLTO is more like
how linktime code generation works today in Visual Studio.
And he says the Windows builds rely heavily on this.
Otherwise, it would basically take forever to compile.
And yeah, maybe we'll have someone to talk about the Visual Studio linker in more detail sometime.
Yeah, I don't think we've ever really talked about that.
I feel like the open source linkers get a lot more of the conversation than the Visual Studio ones,
which a lot of our listeners use.
Yeah, but it's certainly good to know that
they have this link time code generation feature option.
So if an LTO sounded great and wonderful to you,
Visual Studio has similar support.
Right.
Okay, we'd love to hear your thoughts about the show.
You can always reach out to us on Facebook, Twitter,
or email us at feedback at cppcast.com.
And don't forget to leave us a review on iTunes or
subscribe on YouTube. Joining us
today is Conor Hoekstra.
Conor is a Senior Library Software Engineer at
NVIDIA working on the Rapids team.
He is extremely passionate about programming languages,
algorithms, and beautiful code. He is the
founder and organizer of the Programming Languages
Virtual Meetup, and he has a YouTube channel.
He also recently announced at Meeting C++ and C++ Russia
that he is starting a podcast with Bryce Adelstein-Lelbach
called the Algorithms Plus Data Structures Equals Programs Podcast.
Conor, welcome back to the show.
Thanks for having me. This is super exciting.
And this is another freaking competitor to our podcast, Rob.
So to be clear, we have our own competitor. We don't compete with CppChat or CppCast. I didn't declare these our competitors; this is Bryce's doing. But there's another podcast that's been kicked off by JF Bastien called TLB Hit, and the domain is tlbh.it, which I think Hana Dusíková helped them set up.
And yeah, Bryce's one sort of talking point to me was, as long as we're better than JF's podcast, I'll be happy. So we're apparently competing with them, not with you. Good to know, good to know.
So have you released any episodes yet? We have recorded our first episode
and I'm in the midst of editing it and setting up all the podcasting things. But by the time your listeners are listening to this, it should be online, and if people want to check it out, I'm sure the information will be in the show notes.
I'd just say, as we've told many other people, consistency is the key: if you do one episode and then never release another one again, you probably won't get a lot of listeners. Yeah, ours, the goal for me is
for it to be a very low overhead podcast. So it's inspired by a podcast called Magic Readalong,
which is just two guys that sort of hop on a phone call for 20 minutes. And like half the time,
they're not talking about anything
programming related. And then every once in a while they'll just drop into category theory.
So half the time you're hearing about what they're eating for dinner or barbecue. And the other
times, you know, you're hearing some interesting things about adjunctions and, you know,
catamorphisms and stuff like that. So hopefully it's just going to be that, you know, little,
very little editing. And with that low overhead, hopefully it'll be easy to pump them out, you know,
once a week, like you guys do. Yeah. Sounds cool. And sorry, but happy Canadian Thanksgiving. What
was that? Two weeks, two weeks ago. Uh, it was back in October. So yes, the real Thanksgiving,
the Canadian Thanksgiving, about a month earlier. It depends, because it fluctuates from year to year for us; I'm not sure if it's the same for you. For American Thanksgiving, it's always the third Thursday. Thursday, yeah, or whatever, yeah, something like that.
Right, yeah. It should be interesting to see, or hear, how many people are visiting or how many people are just celebrating individually. Yeah, yeah.
We're getting a pretty significant phase of re-closures in Colorado right now leading up to this, so yeah, we'll see what happens. Yeah.
Okay, well, Conor, we got a couple news articles to discuss. Feel free to comment on any of these, and we'll start talking more about what you've been up to lately.
Sounds good.
Okay, so this first one is a new library.
It's on GitHub, and it's called Butano,
a modern C++ high-level engine for the Game Boy Advance.
We've talked about emulators a lot in the past,
but I've never heard of Game Boy emulators.
But this one looks pretty neat.
It's not a Game Boy emulator.
It's a C++ library for writing Game Boy games.
Okay, okay.
I misread that then.
So do you need an emulator or is a Game Boy Advance something you can kind of still easily get and put a game onto?
I would say you can easily get a Game Boy Advance. Getting a game onto it is slightly harder: you have to get one of the hobbyist cartridges that lets you put an SD card in it, or write your own ROM. Right. But if you use this tool to program and then launch in an emulator, that would by far be the easiest way to play with this.
Right.
Very cool.
I discussed this with a couple of people; it came up at some point recently.
It might've been at the Meeting C++ hangout, possibly. I'm not sure.
And someone's like, oh, well, that's too new for Jason, basically, because the Game Boy Advance is like an ARM7 processor.
So it's actually got wide support with compilers,
with GCC and Clang, which helps a lot for this.
I'm not trying to downplay the achievement or anything,
because something I've actually thought about doing was,
like, how hard would it be to bootstrap some C++ code on a Game Boy Advance,
and then I gave up shortly after I came up with the idea.
So I didn't mess with it again.
Do you do any Game Boy Advance programming, Conor?
I do not, no.
I didn't even have a Game Boy.
I was in school when they came out.
But yeah, it was typically at recess.
You'd have four or five kids in class that had a Game Boy
and then everyone would be huddled around them
while they were playing their Pokemon Blue or Red
or whatever those games were called.
I did not play Pokemon either, so when the AR game came out
and everyone was freaking out, I didn't really have the context
of catching them all when I was a kid.
I agree. I never played either.
Did you get in on that, Rob?
No, I didn't have a Game Boy as a kid either.
No, but did you get in on the Pokemon thing?
No, no.
I've never been into Pokemon.
I think I watched some of the cartoons as a kid, but never played the games.
Okay.
I have an anecdote about Pokemon Go.
I had worked with an intern once in my first job that had written a small program to send him texts whenever there was a Pokemon that was within a 10 minute walking radius that he currently didn't have in his whatever library. And so every once in a while, I'd see this intern jump up from his desk and bolt and say, I'll be back.
And he'd be running out to get some rare Pokemon that popped up a couple blocks away, which I just thought was fantastic that there was this Pokemon Go API you could hook into to do stuff like that.
There's still people who are really into that game.
My wife and I went to the park a few weeks ago and saw all these people heads down on their phones.
And there was a Pokemon Go meetup going on at the park. Yeah, I think it's fantastic; it gets people out doing more exercisey things. Yeah, yeah, true.
Okay, next thing we have: we haven't talked about Boost in a while, but we have this article that version 1.75 is coming out soon, I think.
It says in progress.
I'm not quite sure what that means in terms of their release schedule.
But they are adding a few new libraries with this upcoming release.
So there's going to be a new Boost JSON library,
obviously for JSON parsing and serialization.
And there's also Leaf, which is a lightweight error handling library,
and PFR. I feel like we've heard of Leaf before.
It sounds familiar.
I don't know.
No, I don't remember.
As usual, I got distracted by the first thing and went down a little rabbit hole looking
through the source code of the boost JSON library and looking at how their allocator
support works because allocators are something that I've been thinking about lately.
And then I forgot to come back to this particular article
and look at the other two libraries.
Well, how does the Boost.JSON library source code look, then?
It looks very boosty.
There's lots of macros for boost things
and boost things used in the boost library.
But other than that,
it looks like it has fairly straightforward allocator support
for passing in your own
memory resource and using your own kind of allocators
with it which is the part that I was actually interested in
Connor do you have a chance to look at any of these?
Yeah I took a glance, the one thing that caught my eye was the
addition to the MP11
metaprogramming library
So Barry Revzin had requested an algorithm that ended up getting named pairwise_fold, which is sort of the metaprogramming equivalent of adjacent_difference with a different name. So it takes adjacent elements and applies a binary operation to each pair of elements.
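Since MP11 operates on type lists at compile time, the easiest way to picture the operation is its runtime cousin. Here is a minimal sketch (my own example values, not anything from Boost's documentation) using std::adjacent_difference, which likewise applies a binary operation to each adjacent pair of elements:

```cpp
#include <functional>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> v{1, 3, 6, 10};
    std::vector<int> out(v.size());

    // adjacent_difference applies a binary op to each adjacent pair of
    // elements; the default op is minus, but any binary operation works.
    std::adjacent_difference(v.begin(), v.end(), out.begin(), std::plus<>{});

    // Quirk of the algorithm: the first output is a copy of the first input,
    // so out is {1, 4, 9, 16}, i.e. pairwise sums after the leading element.
    return out.back();  // 16
}
```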
So interestingly, because Barry actually messaged me a couple weeks ago about this when it got added.
And then I was like, why did you call it pairwise fold?
And then he didn't actually realize the name that it had ended up with.
I think he initially requested zip_tail_with. So zipWith is an algorithm from Haskell that takes two ranges and applies a binary operation. And zip tail is a trick where you do that with the first range being elements zero to n-1 and the second being one to n, which gives you adjacent pairs when you zip those two ranges. So zip_tail_with is sort of the combination of zip tail and zipWith.
And then they threw around a couple of different names like pairwise.
I could get this wrong.
I think it's from Kotlin or some other language that is the equivalent of sort of like adjacent
elements.
But they ended up adding fold to it and it's not a fold.
So when Barry realized that, he was like, oh no, that wasn't one of the names we actually considered; we considered pairwise, but somehow fold got added onto it. So that's an interesting story.
So MP11 is a compile-time algorithms, right, metaprogramming library? Compile time, I believe so, yeah. So, I mean, without looking at this, that implies to me that it's taking, like, two sets of types and doing a pairwise operation on the types. Yeah, I think potentially that is one thing you can do with it. Okay. But not a fold. Yeah.
I mean, it's interesting, the naming history on these things. In the textbook that I'm working my way through, SICP, the Structure and Interpretation of Computer Programs, they have a map, so, our equivalent of a std::transform, that takes a variadic number of sequences.
So our std::transform can take one range or two ranges. But hypothetically, you could have a std::transform that takes any number of sequences. And then instead of a unary operation or a binary operation to transform those, it can just take an n-ary operation. So a lot of languages just call this map; Clojure and Racket, so Lisp dialects, and Python do that.
In the Structure and Interpretation of Computer Programs they call it accumulate-n, which, coming from C++ land, is very confusing, because you would think that typically underscore-n for us means operating on a subset of the first n elements of your range, and accumulate is our name for reduction. But when you think about it, when you're transforming n elements from n sequences at a time down to one, you're not folding one by one over the elements, but you could argue that that is, in a sense, a reduction. They're basically saying you're accumulating over n of those elements. Anyways, there's a bit of a reduction, a bit of a transform, and every language sort of makes different decisions. So naming is
hard, as they say. Right, it's hard.
Okay. Last article we have is a trip report, a virtual trip report, I guess, from Herb Sutter, and this is about the Autumn ISO C++ standards meeting, which was held virtually. And we briefly talked about this a few weeks ago when we had JeanHeyd and Aaron and Peter on.
And yeah, it's great to hear
that they were able to actually vote on some papers
for C++23 during this virtual meeting.
And as I just mentioned, JeanHeyd,
he was actually, his paper was the first one
to get voted into C++23, which is pretty exciting.
He has his paper P0330, which is
going to add a uz literal suffix to the language
for numbers of size_t type.
Yes, I mean, I needed this suffix like four times on Monday, basically. I'm like, static_cast, static_cast. I'm like, static_cast, static_cast. All right, yeah, I'm looking forward to this.
Although it makes me wonder: if there's uz for unsigned size_t, I mean size_t, which is unsigned, I knew there had been some discussion about an sz or something also for a signed size_t, which is in places in the standard as well. It just makes me wonder if there's a signed and an unsigned version or not, but I didn't read the paper to find out. Conor, do you have any info on that? Are we getting both signed and unsigned versions of this? Ah, it looks like uz and z. Sorry.
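For context, here is a minimal sketch of what the new suffixes buy you (C++23 only; the vector and loop bodies are just stand-ins of mine):

```cpp
#include <cstddef>
#include <vector>

int main() {
    std::vector<int> v{1, 2, 3};

    // Before: a plain 0 is an int, so comparing against v.size() means a cast
    // (or signed/unsigned warnings).
    for (auto i = static_cast<std::size_t>(0); i < v.size(); ++i) { /* ... */ }

    // With P0330: 0uz is a std::size_t literal (and 0z is the corresponding
    // signed type), so auto deduces the right type directly.
    for (auto i = 0uz; i < v.size(); ++i) { /* ... */ }
}
```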
Oh, okay. Yeah, it's good to see the committee's still making progress. I know there are differing camps on whether, for C++23, we should still be aiming for it; there's a paper from a while ago, A Bold Plan for C++23, and the question is if we should still aim for that, or, as some argue, slow down a bit due to COVID times and make this sort of more of a C++14-style release, just polishing features that went into 20.
But yeah, most programming languages, whether designed by committee or designed by a core group of developers where you can submit, you know, PEPs for Python or JEPs for Java, a lot of these languages already, even before COVID, have their
development cycle on GitHub and online. So I'm really happy to see that C++ is still
making progress and slowly sort of transitioning to that model.
I should mention just a couple of other things that they did vote into C++23. We're also going to get a string contains function, which is another one that I think we should all be excited about, that we'll finally have. So you can just say if string contains whatever, as opposed to having to write that by hand with the find-doesn't-equal-npos check. Yeah, yeah.
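As a quick sketch of what that looks like (C++23; the example string and function are mine):

```cpp
// C++23 adds a contains() member to std::string and std::string_view.
#include <string>

bool mentions_cpp(const std::string& s) {
    // Pre-C++23: the find-doesn't-equal-npos dance.
    bool old_way = s.find("C++") != std::string::npos;

    // C++23: say what you mean.
    bool new_way = s.contains("C++");

    return old_way && new_way;
}
```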
And also, did you note that Richard Smith is no longer going to be the lead editor of the standard? I did not see that. Thank you to Richard Smith for his work over many years as project editor for the C++ standard and for completing C++20 this month. Oh yeah.
Starting now, we begin C++23 with Thomas Köppe, who has graciously agreed to step up as the primary project editor, while Richard will stay on as backup project editor. Cool, good for him. We should have either Richard or Thomas on; we've never had either of them on. No, we have not. Yeah.
Were you involved at all in this meeting, Conor? I was briefly an attendee; I had multiple meetings going on at that time. Okay. But I have been meeting with a small sort of algorithms-slash-ranges group for the last few months, sort of working on the paper that ended up going into the most recent mailing. And we'll discuss that in a minute, I believe.
Okay. Well, why don't we start talking more about, uh, what you've been up to lately,
Conor? I know we just had Meeting C++ 2020 last week,
and you gave a new talk there.
Is that right?
Yep.
Yeah, I attended not all of the conference,
but it was, I think, over three days.
And I think we all saw,
or I definitely saw Jason at a couple tables.
And were you there too, Rob?
I can't...
Yeah, Jason and I both did a Q&A
and hung around for a bit after that.
Oh, yeah, that's right.
Yeah, I caught the second half of your Q&A
because I was in and out of work meetings.
Yeah.
How did you guys find Meeting C++ online?
I didn't stick around for much more of the conference
after our Q&A, unfortunately.
I just had too much other stuff going on
to be an attendee this year. Yeah, I also, I mean, I hung out for a while and as always, it's good to
see people even if remotely who we haven't seen in a while. Um, and it was fun doing the AMA.
I think I'd do that again. I think it went well. Yeah. Yeah.
I guess the time zones are a bit hard for us, because it starts, I think, 2 a.m. locally for me, which means even earlier for you. Yeah, it would be like midnight that it was starting for me. So yeah, not ideal times. No, not ideal. We're going to talk to Jens about that. What was he thinking? Yeah, really.
Well, so what was your talk this year at Meeting C++? So this year, I gave a talk called C++ Concepts versus Rust Traits versus Swift Protocols versus Haskell Typeclasses, or some permutation of those language features.
I don't recall the exact order. And this was a talk that over a period of like two years in my head of sort of watching different videos and reading different content,
I had this realization that the C++20 concepts facility that we were getting has existed in other languages for decades or something very similar and has been added to more modern
languages like Rust and Swift. The first time, so I mentioned this all in the talk, I'll give like
an abridged version of it, but the first time I sort of realized this was in 2018 when I started
learning Haskell. I learned about type classes, which basically let you specify the class of a generic type,
and then basically enables you to do certain operations with that generic type.
So when you have a generic function in Haskell that, as a single input, takes a generic type A and returns as output a generic type A, you can really only do one thing with that function in Haskell, which is return it, because you don't know that you can do anything else with the generic type.
So that's a function in functional programming called identity, and I explicitly remember learning about this on another podcast called LambdaCast, which I actually started listening to because you mentioned it once in your sort of introductory news items in one of your episodes.
the most useless thing. Like, why would you ever want a function that does nothing and just returns
you back what you have? But if you start delving into the functional programming world, it actually is very useful in certain cases. One of them, for instance, is it doesn't exist in the
STL, but in the CUDA equivalent of the STL, Thrust, we have algorithms that take basically,
in some cases, a third range, which is known as a stencil. It's very, very similar to projections in C++20 range algorithms. And so you can do something like a copy_if that takes two ranges, where it copies elements from the first range if the unary predicate returns true when applied to the second range.
So in a std::copy_if in the STL, you always have to apply the unary predicate to the range that you are copying from, because you're only allowed to specify one. But an alternate design for that algorithm could take two ranges, the second of which you don't copy but are just checking. So if anyone out there is familiar with Excel, it's sort of similar to a function in Excel called SUMIF; it basically sums up elements in one range based on another range. It's the exact same idea.
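To make the stencil idea concrete, here is a small sketch using Thrust's stencil overload of copy_if; the values and names are my own illustration, not code from the talk:

```cpp
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <vector>

int main() {
    std::vector<int> v{10, 20, 30, 40};
    std::vector<int> s{1, 0, 1, 0};
    thrust::device_vector<int> values(v.begin(), v.end());
    thrust::device_vector<int> stencil(s.begin(), s.end());
    thrust::device_vector<int> out(v.size());

    // copy_if with a stencil: values[i] is copied when pred(stencil[i]) is
    // true, so the predicate never looks at the values themselves. Here the
    // predicate is simply thrust::identity, i.e. "is the stencil value truthy".
    thrust::copy_if(values.begin(), values.end(), stencil.begin(),
                    out.begin(), thrust::identity<int>{});
    // out now begins with {10, 30}.
}
```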
But sometimes with a copy_if you're just copying; if you have some equivalent of a transform, though, thrust::identity might actually be what you want in that case, or std::identity, which we're getting in C++20.
Anyways, back to Haskell: that generic A to a generic A, it's called identity. And the type classes give you the ability to say, well, my generic type A is in the type class Num, which sort of stands for numeric. And now you can do more with it. You can add them together. You can do the things that you can do with numeric generic types. And then ultimately in the talk,
I sort of show the Swift protocols and Rust traits. I also include D type constraints.
And sort of the points of my talk that I really try to drive home is that in this broad category
of what in sort of academia is referred to as constrained parametric polymorphism,
there's two categories, the concept model, which I sort of term type constraints,
and then the type class model, which I just sort of stole the name from Haskell. And so Rust,
Swift, and Haskell fit into the type class model, and D type constraints and C++ concepts fit into the type constraint model. And so the
difference is that where in Haskell or the type class model, you start with a generic function
that you can't do anything with, and then you have to specify the type classes in order to do
more things with it. In the concept model, it's the exact opposite. You start with a generic function in C++, you know, now you can just go auto F
and with a parameter, you know, auto P and that returns, you know, an auto P.
So it's a generic function that, you know, takes a single argument and has a single output.
But you can do absolutely anything with this. It might not compile, but you don't need to do any extra specifications
in order to say, you know, oh, this can work with some, you know, it's got a dot name method or a
dot area. So the example in my talk I show is having a shape concept and then a circle class and a rectangle class that adhere to this concept. In C++, you don't need to specify that shape concept in order to get the code to work.
You start with the world, and then you constrain with a concept what you're able to do with that generic type.
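A rough sketch of that example follows; the names Shape, Circle, Rectangle, and area are my own stand-ins, not necessarily the code from the talk:

```cpp
#include <concepts>

// A concept constraining what we are allowed to do with a generic type:
// it must have an area() member returning something convertible to double.
template <typename T>
concept Shape = requires(const T& t) {
    { t.area() } -> std::convertible_to<double>;
};

struct Circle {
    double r;
    double area() const { return 3.141592653589793 * r * r; }
};

struct Rectangle {
    double w, h;
    double area() const { return w * h; }
};

// Unconstrained: compiles against anything with an area() member, or fails
// only at instantiation time. This is the "start with the world" model.
double total_area_unconstrained(const auto& a, const auto& b) {
    return a.area() + b.area();
}

// Constrained: the concept narrows the world and documents the requirement.
double total_area(const Shape auto& a, const Shape auto& b) {
    return a.area() + b.area();
}

int main() {
    return static_cast<int>(total_area(Circle{1.0}, Rectangle{2.0, 3.0}));
}
```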
Anyway, so they're sort of completely opposite to each other. And I find it very interesting that in certain languages like Swift, they refer to this parametric polymorphism as constrained generics, when really, when they say unconstrained generics, that's a generic function that doesn't have any constraints on it, which you can do nothing with. And then by adding the constraints, you can do more, which sort of seems like they have the terminology backwards. And I've talked to a few people, and they've shared motivations for potentially why they did it that way. But I think it's key to
understanding that in C++, we start with the world. And by adding concepts, you are basically
restricting what you can do with your function. But in many other languages, it's the exact
opposite, you start with a generic function that does very little, and then you enable it to do
more by adding the constraints to your generic type. So that's sort of in a nutshell what the
talk's about. Interesting. You get good feedback, good questions from the audience? Yeah, there was
a few questions at the end. It was originally a 90-minute talk that I sort of cut 20% out of,
and it was supposed to be a 60-minute talk, so it was a pretty quick cadence. But afterwards, we hopped into sort of a remote Q&A.
One of the questions I got was how come I didn't compare these language features to Java and their
interfaces. C# also has interfaces; Go also has a language facility called interfaces, but it's slightly different than what Java and C# have. Definitely, you can do a comparison between those, but I can't cover every language out there in a 60-minute talk. And I said my primary motivation was that I was trying to choose languages that were adjacent to C++ and that operate sort of in the same space. So
definitely Rust, you know, a lot of people say in the same breath as C++. D, although a lot less
popular, is extremely influenced by C++. Swift is a very, very interesting language that I knew
absolutely nothing about going into sort of preparing for the talk. And it is a it is a very, very pleasant,
pleasant language to use. And I wish I wish they were focused more on sort of cross platform and
not just like mobile app development for iOS. Because it's it like when you start off with
Swift, it feels like Python. And Chris Lattner, the designer of the language, one of the values that he wanted for Swift was something called progressive disclosure of complexity,
otherwise sometimes described as wanting the language to be infinitely hackable,
so that like it feels like Python at first, which it truly does. Like you open up the playground
online, you type print parentheses, hello world and parentheses, like exactly like you would in
Python, and it works. But slowly, as you start to learn the language, you can sort of peel back the layers.
And it's an amazing experience coming from the world of C++, where we don't necessarily have the best... like, now, thanks to Godbolt, I think it's a lot easier to go and just start typing away and get started with C++.
if someone's not holding your hand and you very quickly, like I remember the first time
in high school trying to get, where was it, university, it was a while ago when I was
trying to get started with C++ and it was just awful.
You had to go download things and get them all set up and then nothing worked for three hours.
And then finally, I think I just ended up
getting Visual Studio working,
which was the easiest way.
But yeah, there's a long way to go,
I think, in C++ in terms of our like
first impression experience that we have.
I kind of feel like collectively,
we lost a lot in the 80s to 90s transition where every computer you turned it on and immediately had a programming environment.
I mean, it was basic, but you had a programming environment as soon as you turned it on that you had to do some programming to use it.
Or you could very easily, if you chose to, just type things in and see what happened.
And yeah, now you're like, you know, that doesn't exist so much in
the same way anymore. Yeah, it's part of the reason I think why Python, you know, there's
many reasons you can stipulate why Python became so popular. But with basically like every Linux
distro, it's just automatically bundled. You just type Python and poof, you hop down into
the whatever command line editor or interpreter. And yeah, I think,
I think for any language that can get like automatically included in a bunch of operating
systems, that's going to give you a big headwind. Yeah, I feel like for those of us who are maybe
slightly older, it feels like Python completely replaced Perl somehow, because Perl's, I believe,
actually part of the POSIX standard. So you knew Perl was always going to be there. And now, I don't know. The one programmer that I knew who chose to do Perl for all time is now no longer doing Perl after like 15 years or whatever it was that he programmed in Perl.
Yeah, Perl. I feel bad for Perl. Like, I only ever spent about half a week of my career working in Perl.
I had to debug Perl.
And so I don't know much about it.
All that.
Other than that, like everyone says it's one of, you know, the write once, read never, impossible to read languages.
But like I actually, I think Perl is one of those languages that like if you write it well, which I think the Perl code,
I wasn't an expert, but it seemed nice. And I really liked the single character,
although this is coming from someone whose favorite language is APL.
I was just going to say that.
The single character notation for this is a hash map, this is an array. I love that stuff.
And if you think about it, Python is somewhat similar to that.
When you go like variable equals braces, that's a dictionary.
When you go variable equals brackets, that's a list.
You have to go equal set for a hash set.
But I don't know.
I think Python developers know what that means.
When they see L equals bracket bracket, they know that that's a
list, which is somewhat similar to, you know, what Perl was relying on. So I don't know, I feel all
these languages that get bad raps, I just feel bad for them. We just need to we need to love all our
programming languages, even if they've fallen out of favor. Like, this, this programming language
needs a home. Would someone be willing to adopt it?
Yeah.
I wanted to interrupt the discussion for just a moment to bring you a word from our sponsor, PVS-Studio.
The company develops the PVS Studio Static Code Analyzer,
designed to detect errors in code of programs written in C,
C++, C Sharp, and Java.
The tool is a paid B2B solution,
but there are various options for its free licensing
for developers of open projects, Microsoft MVPs, students, and others. The analyzer is actively
developing. New diagnostics appear regularly, along with expanding integration opportunities.
As an example, PVS Studio has recently posted an article on their site covering the analysis
of pull requests in Azure DevOps using self-hosted agents. Check out the link to the article in the podcast description.
Going back to your talk on concepts
and comparing it with these features of other languages,
as someone who has studied and worked with all these languages,
do you have a preference for the constrained way concepts work
versus the type classes where you have to start adding to the type classes in order to be able to use it?
Yeah, my personal preference is the type class model.
But this is like one of the classic sort of tradeoffs where people fall into two different camps.
There are people that like... it's the same argument, in my opinion, as dynamic versus static typing. There are certain language facilities or features that require sort of more upfront thinking, and the trade-off is that you're going to have static analyzers that are going to be able to tell you more, and you have more sort of sureness of whether your code's going to be correct at the time you run it. Whereas, you know, with other languages
like Python and JavaScript, you can just sort of write it, it'll run and you'll find out at runtime,
uh, or like a month later if it works. Um, and I think there's use case, there's times when you
just need to prototype and get some, you know, proof of concept out the door. Um, and there's
other times where like being sure that something's running correctly is more important.
I tend to always fall on the side of like, I would prefer static typing so that, you know,
my compiler can do more for me. I would prefer to like specify the concepts on front and like
have the compiler force me to think about like, what do I actually want to enable my generic type to have? Whereas in the, in the C plus plus model, we now, and this is also sort
of a legacy thing, right? Like we, we are building this on top of templates, you know, potentially
they would have designed it, um, in the type class way if they could have. Um, and so I, I have no
like qualms with the design. I think it's probably a necessity. Um, but if, if I was starting from
scratch, I think like I prefer the model that, maybe let's spend a little bit more time thinking up front.
And let's get more guarantees later. Because I would, I would prefer to spend more time thinking
than more time like debugging later, is my personal preference. But like I said, like,
I understand the arguments for both. And there are some times when like, I've been trying to do something in language x. And I'm just like,
let me just try this in Python. And then like, it was 45 minutes wasted. And then five minutes in
Python, I have it working. And so and so yeah, there's there's a time and a place, I think,
for for different languages. So if you don't mind, if we go back in time a little bit here,
shortly, I think after we had you on the first time, you ended up giving the most awarded
C++ talk ever, I believe, if we were to quantify these things when you gave your algorithm intuition
talk at C++ Now, which was well regarded. And we discussed with Tony how jilted he felt because he
didn't win any awards that year. But you know, that's all fine. That's history.
Since then, I mean, you've developed,
you've done several talks now on algorithms and algorithm intuition,
and it's a huge topic,
but something we need to talk about
because it's been so well accepted talks.
What kind of things do you talk about in those talks?
What's your favorite talk?
Well, so the first thing I should say, because, cause I know now that you've mentioned Tony's
name, I'm going to hear about this from him.
I don't think Tony was actually at that C++Now 2019.
So I had heard from him that he was a bit upset that, you know, first of all, I think a couple awards were added that year, like Best New Speaker, which I won.
And then he said that that's not really fair.
Like, you can't just add awards and then say, you know, because his claim was that he would have won that stuff in the past.
Anyways, Tony was all set to come back in 2020, and apparently he was going to reclaim his throne.
So the dust hasn't settled on that. I don't think it is all in the past; I think it's in the future, and we will see the return of Tony whenever C++Now happens again.
And look, I'm a huge... he emailed me once, when I joined the Canadian national body, and he said, my nemesis. And I was like, what? Like, I'd never interacted with Tony before. And he's like, two of my favorite talks are Words of Wisdom and Postmodern C++. So, but anyways, it's all good fun.
What, you actually won like eight awards for that talk or something? I don't recall. Oh, come on, you have a stack of them, I know you do. Well, no. So Jon Kalb, the organizer, he says that there are plaques or something, but I think he's got like a few-year lag on sending those out, he said.
But yes, it was a big surprise. The talk was very well received, and yeah, I'm glad that people enjoyed it so much. It was also my first talk, too, so I was very surprised. Like, I was expecting feedback, like, oh, you know, try this next time, or try this, or tweak it slightly differently. But yeah, so the idea for the talk
initiated from a combination of two things. It was watching Sean Parent's C++ Seasoning talk from GoingNative 2013, which, like I've said, I mention in probably every podcast or talk that I've ever given, because, for me at least, it was such an inspiring talk. And yeah, it's like 50% of the reason
that I ended up giving the talk.
And then I stole the title
from a previous CppCast episode.
I think it was the first time
you had Kate Gregory on episode 30,
where she had like a little thing she said
in the midst of the episode was that,
you know, we spend a lot of time
focusing on data structures, but not as much time focusing on algorithms.
And she said this quote that was like, for people that, you know, we need to develop algorithm intuition or like an intuition for algorithms, I think was her exact words.
And that in 10 years, people that haven't done this are going to be trailing behind sort of other folks.
And as a listener, you hear that and you're like, oh no, FOMO, I'm going to be left behind, my employer's no longer going to employ me if I don't know these algorithms. So I slowly started trying to learn them. And then I sort of just went on this journey of having insights like, oh, this algorithm is similar to this algorithm.
So probably one of my biggest eureka moments in the first talk was when I was trying to solve this problem. I won't go into the details, but the gist of it is you have two ranges and you're trying to index into both of them and then do some kind of either mapping operation or reduction operation. In this case, I think it specifically was you were taking two ranges and then binding them together and ending up with a single value.
So there's sort of a zip in there, and then a transform and then a reduction.
And in Python, they have algorithm called zip, which we do not have in C++ currently, although that is in the planned C++23 range algorithms to be proposed.
So we will get it in the future. But at the time, we didn't. And that meant that there was no way,
at least that I could see at the moment, to solve it with an algorithm. So I had sort of a for loop
for my C++ solution. And then for Python, I had a zip with sort of destructuring. So you're
zipping two ranges together using the equivalent in Python of C++17 structured bindings, and then using an algorithm combined with sort of a generator expression to do it in a single
or two lines, which I thought was like super nice. And then two weeks after I had sort of made a YouTube video about how, you know,
Python's solution was much nicer than the C++ solution, I realized that we actually did have an algorithm that implicitly was doing a zip, and I just didn't know about it. And that algorithm was inner_product.
And then in Algorithm Intuition, and the follow-up, Better Algorithm Intuition, I had a whole category of, I think, five different algorithms that implicitly have zips in them. So to pause on inner_product: you can think of an algorithm that takes two ranges and a binary operation and applies that binary operation to each pair of elements at the same index, one pair at a time. That is essentially a zipping algorithm; we just don't name any of these algorithms zip-underscore-something.
So inner_product's the first one. I'm not sure if either of you two want to guess what the other four are, or we can play a game: you can guess, and then our listeners can
guess as well. What's terrible is I've watched this talk and I can't tell you.
So the more general version of inner_product is std::transform. The equivalent of zipWith currently, in the pre-C++23 or C++20 range algorithms, is the std::transform that takes two ranges. So std::transform is your sort of map from functional programming. When it takes one range, it's a map. When it takes two ranges, it's a zipWith. And then inner_product is basically a std::transform with a reduction after it.
So a std::transform takes just a single lambda or function object that it applies to either one range or two ranges. std::inner_product, or as it was renamed in C++17, std::transform_reduce, takes two lambdas or two function objects: one of them does the transform, and then one of them does the reduction.
Wait, wait, are you saying inner_product and transform_reduce are the same algorithm? Not identical, but basically std::transform_reduce is the parallel, renamed version of inner_product. Yes. Okay.
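To make that relationship concrete, here is a small sketch (my own values, not anything from the episode):

```cpp
#include <functional>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> a{1, 2, 3};
    std::vector<int> b{4, 5, 6};

    // std::inner_product: "zip" the two ranges with one binary op (multiplies
    // by default), then fold the results with another (plus by default).
    int dot1 = std::inner_product(a.begin(), a.end(), b.begin(), 0);

    // std::transform_reduce (C++17): the same idea with the two operations
    // named explicitly, and with overloads that can run in parallel.
    int dot2 = std::transform_reduce(a.begin(), a.end(), b.begin(), 0,
                                     std::plus<>{}, std::multiplies<>{});

    return dot1 == dot2 ? 0 : 1;  // both are 1*4 + 2*5 + 3*6 = 32
}
```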
And like inner_product, I think Matt Godbolt mentioned it in one of his talks that he gave at the Avast meetup that Hana Dusíková runs. Matt Godbolt gave a talk called, I think, C++, the old or the new old language, or something like that.
And then also Ivan Čukić, who wrote the Functional Programming in C++ book and has given a number of talks, mentions it in one of his as well: that inner_product is a really bad name.
And I think Alexander Stepanov chose it because he has a strong math background.
And this algorithm is called inner product in like the world of mathematics.
But really like as a generic algorithm, it's a bad name for it,
because inner product is like a domain specific name, when really what you're doing is taking
two ranges, you're transforming them into a single range with a binary operation,
and then you're reducing it down to a single element with another binary op,
in the form of what they call a reducer. And I guess I should note also that the C++17
std transform reduce is also different in that it has an overload that takes just a single range.
So inner product, you need two ranges. But std transform reduce, similar to std transform,
you can give it either a single range that it first does a transformation on and then a reduction on, or you can give it the two ranges similar to std inner product.
And the reason for that is that it seems like why would you ever want a std transform or a std reduce and do the transformation in the reduction lambda or function object.
But the thing with the C++17 parallel versions,
the transform reduce and reduce,
is that there are strict requirements on these algorithms
because they can be performed in parallel.
So the binary operations that you provide to it
have to be both commutative and associative,
which means that you are very restricted.
So for example, if you wanted to do a parallel,
you wanted to calculate the total length of strings
in a vector of strings.
So you've got a vector of strings that's got, you know, dog, cat, elephant, and mouse, and I don't know what the total is, it's roughly like 10 or 15 characters; dog and cat are three characters each. And the idea is you want to get the total length of all the strings. In pre-C++17, pre-parallel-algorithm world, you could just do a std::accumulate
that basically has an accumulator
that's going to be the running sum of the total lengths.
And then inside, you're going to go return of your lambda.
You're going to go return accumulator plus string.size.
So like in the arguments of your binary operation,
that it's a reduction,
the first element's your accumulator,
which is going to be like an integer or a size.
And the second parameter is going to be a string.
So that, when you have a binary operation
where the types are not the same,
it's definitely not commutative and associative.
So in order to do this in parallel,
what you would need to do is use a std transform reduce,
where the transform first turns the strings into lengths
of either, you know, size_t or integer, and then the reduction binary operation is now going to take an accumulator of size_t and also an element of size_t. So you do the transform first, so that you can have a commutative and associative reduction binary operation. Hopefully I didn't just put all the listeners to sleep.
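For anyone following along at a keyboard, here is a sketch of the example as just described (the strings and variable names are mine):

```cpp
#include <execution>
#include <numeric>
#include <string>
#include <vector>

int main() {
    std::vector<std::string> words{"dog", "cat", "elephant", "mouse"};

    // Sequential: the accumulator (std::size_t) and the element (std::string)
    // have different types, so this binary op is neither commutative nor
    // associative -- fine for accumulate, not allowed for a parallel reduce.
    std::size_t total1 = std::accumulate(
        words.begin(), words.end(), std::size_t{0},
        [](std::size_t acc, const std::string& s) { return acc + s.size(); });

    // Parallel-friendly: transform each string to its length first, then
    // reduce the lengths with plus, which is commutative and associative.
    std::size_t total2 = std::transform_reduce(
        std::execution::par, words.begin(), words.end(), std::size_t{0},
        std::plus<>{},
        [](const std::string& s) { return s.size(); });

    return total1 == total2 ? 0 : 1;  // both are 3 + 3 + 8 + 5 = 19
}
```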
Well, it helps from my perspective that I had actually the,
once you started talking about transform reduce,
I brought up the cppreference page with the,
the overloads that are available.
So I was glancing down at that and I was making sure I was following along.
Yeah.
So hopefully, if you're on a Peloton, you can switch the tree, or the park that you're in, to the cppreference page, and if you're on a run, I apologize.
This actually is very apropos for me, because just on Monday or Tuesday I was prototyping some code and I'm like, I know there's an algorithm I should be using here, but I needed to do an offset over the same range. And with this discussion about transform_reduce and accumulate, I'm thinking, oh, if I just swap my reasoning a little bit, I could simply apply an algorithm to it and it'll all fall out how I want it to. So I'm going to have to take another look at that later.
So, coincidentally, for what you just explained, inner_product and transform_reduce is exactly what you want for that. Yeah.
So, one of the common ones... when someone once came up to me after a talk and was like, oh, here's a problem you can't solve with an algorithm, which was like, basically, bring it on.
I prefer the more like, hey, is there an algorithm?
The more like, hey, let's try and solve this together.
But I also don't mind challenges.
But it's the, you know, given an array of elements, calculate windowed sums. So if you've got a list of 20 elements, calculate the sum of five elements at a time: from, like, first zero to four, then one to five, then two to six, et cetera. And very quickly, if your ranges and windows are big, you can get quadratic time complexity if you're doing the sum repeatedly, because you're having overlapping windows. So ideally you don't need to recalculate the overlapping part.
Typically, if you're just doing it with a for loop, you calculate the sum of the first five, and then for the next one you subtract the first element off and then add the next element. So you're just subtracting one number and adding the next number, and then you're going to get something that runs in linear time.
You can do this by first doing a std::accumulate on the first five, or n, elements that you need in your window size, and then doing a transform_reduce where basically you're carrying along some state of the previous element, and you start one range at the beginning and one range at sort of the end of your window. Or, actually, I don't even think you need to carry state, because your first iterator is going to point to the element you want to drop, and then the next iterator is going to point to the one you want to add, right?
So you can actually do it. It's a little bit... you're twisting the algorithms in a way that, you know, typically when you use inner_product or transform_reduce, you're iterating over two different ranges at the same sort of index. But yeah, you can twist algorithms to do all kinds of crazy things.
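Here is a sketch of that windowed-sum idea (my own example, not code from the talk): seed the first window with std::accumulate, then slide it in linear time.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

std::vector<int> windowed_sums(const std::vector<int>& v, std::size_t k) {
    std::vector<int> out;
    if (k == 0 || v.size() < k) return out;

    // Sum of the first window, done once.
    int sum = std::accumulate(v.begin(), v.begin() + k, 0);
    out.push_back(sum);

    // Each subsequent window: drop the element leaving the window, add the
    // one entering it. This step could also be expressed as an inner_product
    // or transform_reduce over v and v shifted by k, as discussed above.
    for (std::size_t i = k; i < v.size(); ++i) {
        sum += v[i] - v[i - k];
        out.push_back(sum);
    }
    return out;
}
```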
Well, it sounds very much like what you would need for a moving average window. Like when I'm looking at my YouTube stats and I'm looking at my 30-day moving average, I mean, that's basically the same thing.
Yep, a lot of finance applications, and programming languages that are specific to finance, have those algorithms built in, because you can do them extremely efficiently, and if you're not careful, you can do them extremely inefficiently. But a lot of them just have it built in. And I know Q is an array programming language that's used by a lot of hedge funds, and built into it they have avg and mavg, which is for moving average, because that's stuff they use all the time for ticker symbols and stock prices and whatnot.
I think it's worth noting before we move past
this also that every algorithm we've just discussed are the ones that are actually
hiding in the numeric header, not in the algorithm header. So if you go to look for them...
So yeah, this was the sort of difference between my first talk, which I aimed to be on the algorithm header,
but then I ended up falling in love with all the algorithms in the numeric header,
which is sort of what I talked about.
So inner product, accumulate, adjacent difference, iota, and I'm missing one.
What's the fourth one?
Inner product.
Oh, well, someone's going to tell us online.
And then the second talk that I gave focused on the algorithm header.
And that one was sort of very much in the same spirit
and just trying to highlight sort of hidden relationships.
So like the two probably most useful are one that Sean Parent points out
that whenever you have
a find if, so like this is a code review thing.
If you ever have a std find if in your code, think about is the underlying data sorted?
And if so, you can upgrade your linear find if to a log n binary search, either in the
form of a lower bound or upper bound.
And then there's the binary search.
And there's also two other algorithms that people don't think about very often,
which is equal range, which is also a binary search that just is like upper bound and lower bound at the same time.
And then there's the fifth, completely unheard of binary search, partition_point, which is actually an extremely useful algorithm. Like, two times now at work, I've had a colleague come to me and say, hey, is there a binary search algorithm for this, where you need a sort of binary predicate, the range is partitioned, and you're looking for the point where one of the elements returns true based on the predicate and the one to the right of it returns false.
And it's interesting, I just learned the other day that in the C++ standard they have a section for binary search algorithms, but they only list four of them; they put partition_point next to partition, in like a mutating section, which I find very interesting.
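A small sketch of that code-review upgrade (my own example data): the same query written as a linear find_if, as a lower_bound, and as a partition_point.

```cpp
#include <algorithm>
#include <vector>

int main() {
    std::vector<int> v{1, 3, 5, 7, 9, 11};  // sorted

    // Linear: walks the range element by element.
    auto a = std::find_if(v.begin(), v.end(), [](int x) { return x >= 7; });

    // Logarithmic: lower_bound gives the first element not less than 7.
    auto b = std::lower_bound(v.begin(), v.end(), 7);

    // partition_point is the most general form: the range just has to be
    // partitioned by the predicate (all true, then all false).
    auto c = std::partition_point(v.begin(), v.end(),
                                  [](int x) { return x < 7; });

    return (a == b && b == c) ? 0 : 1;
}
```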
Anyways, and then there were a couple other things I pointed out. The other relationship is std::sort versus std::partial_sort versus std::nth_element.
So this is algorithm intuition part two that you're discussing, right?
Yeah, the technical title is, yeah, better algorithm intuition.
Okay.
And yeah, that code review is about if you have a sort, ask yourself, do you need to
sort the whole range?
If not, you might be able to save some runtime by doing a partial_sort. And then if you're doing a partial_sort and you don't actually need those elements that you're sorting, say the top n of them, in sorted order, switch that to an nth_element, and now you have a linear-runtime algorithm.
I half the time forget about nth_element. It's a very useful algorithm with a cryptic name that can be very useful in certain cases.
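A sketch of that ladder (my own example: the three smallest values of a range):

```cpp
#include <algorithm>
#include <vector>

int main() {
    std::vector<int> a{9, 1, 8, 2, 7, 3, 6, 4, 5};
    std::vector<int> b = a;

    // partial_sort: only the first three positions end up sorted
    // ({1, 2, 3, ...rest unspecified}); cheaper than sorting everything.
    std::partial_sort(a.begin(), a.begin() + 3, a.end());

    // nth_element: b[2] becomes the third smallest element, everything before
    // it is <= it, but there is no ordering beyond that; linear on average.
    std::nth_element(b.begin(), b.begin() + 2, b.end());

    return a[2] == b[2] ? 0 : 1;  // both are 3
}
```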
It kind of sounds like your job is often becoming people saying,
hey, Conor, is there an algorithm for this?
I'm not sure yet.
It's my job description, but it happens from time to time at work.
And it's so fun.
I don't know.
Different things turn different people's clocks.
But yeah, I love staring at a for loop and then realizing, holy smokes, this is just... you know, oh yeah, partial_sum. That's the... how can you forget that? That's the last C++11 numeric algorithm. Because it really should be called scan.
But yeah, one time I saw a transform that just had an accumulator it was passing; it had a mutable lambda that was carrying state, and a transform that carries state is just a scan. So every once in a while you see either for loops or other algorithms that are doing something weird, and then you realize, oh, there's actually an exact algorithm for this.
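A sketch of that last point (my own example): running totals written as a state-carrying transform, as partial_sum, and as the C++17 inclusive_scan.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> v{1, 2, 3, 4};
    std::vector<int> out(v.size());

    // The smell: a transform whose lambda smuggles an accumulator along
    // (std::transform doesn't even guarantee in-order application).
    int acc = 0;
    std::transform(v.begin(), v.end(), out.begin(),
                   [&acc](int x) { return acc += x; });

    // The algorithm that already says this: partial_sum (i.e. a scan).
    std::partial_sum(v.begin(), v.end(), out.begin());

    // Or, since C++17, inclusive_scan, which also has parallel overloads.
    std::inclusive_scan(v.begin(), v.end(), out.begin());

    return out.back();  // 10 in all three cases
}
```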
So yeah, I spend way too much of my free time thinking about this stuff.
I think we've burned almost all of our time by asking you about algorithms now.
Yeah, I would definitely encourage our listeners to watch more of your talks. We'll put a link in
the show notes with a list of all the talks you've given over the past few years. I did want to maybe call back to something you mentioned about ranges and how there's going to be more coming for ranges in C++23, and you're actually working on a paper on that, is that right?
Yeah. So, well, very briefly, I'll say first that
Barry Revzin is the primary author and has done the majority of the work for this paper, so I definitely can't talk about this without giving a shout-out to him. And then there was a small group of people: initially it was Barry, myself, and Tim Song, and then later on a few more people joined, so I apologize if I forget anyone, but Corentin, Christopher Di Bella, Tristan Brindle, and then I think Eric Niebler; we got him to hop on a couple of calls, or at least one call, to make sure, because we're basing this on all of his range-v3 work. He's busy with executors and parallel algorithms right now.
But yeah, so a bunch of us were meeting, and with the help of Barry sort of driving most of the authoring, we put together a proposal for what we think are the top-priority range adaptors and views that we want to see in C++23.
So I won't go through them all,
but the ones that people are probably most going to be excited about
are Enumerate, which is sort of, it's known by different things.
It's from Rust and Python, it's Enumerate,
which is basically bundling an index with your range
that, with C++17 structured bindings, you can immediately destructure. So, like, how many times have you used a range-for loop from C++11 but also needed the index for something, right? So this will avoid the anti-pattern of having to declare an index outside your range-based for, or falling back to one of the old index-based for loops.
Zip is also being proposed, and the equivalent of zipWith: so zip just zips them into a tuple together, and zip_transform applies a binary operation to those zipped elements so that you just end up with a single range.
And I'll say we're hoping to get formatting, so, std::format support for ranges, which will be huge; currently there's no real easy way, other than setting up your own for loop, to print out a range. And then we're also hoping to get ranges::to, which is going to be super, super important for converting a range back into, like, a vector
or something like that. So... oh, ranges underscore to? Ranges colon colon T-O, yeah. Yeah.
Well, what happened when Eric Niebler first joined one of the calls and someone said, oh yeah, ranges::to, it's the most important: he thought that was, like, ranges with the numeral two, as if we were already abandoning the current design and proposing a second version. And he said, oh, did we already get... is it already that old?
But no, yeah, that's ranges::to.
It's a conversion mechanism from ranges
to insert whatever data structure you want.
Yeah, I think it was proposed for C++20,
but it just didn't make the cut
because a lot of stuff, as you know,
went into C++20.
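For readers who want to see roughly what these look like, here is a sketch using the spellings that eventually landed in C++23; at the time of this episode they were still proposals, so treat the exact names as provisional.

```cpp
#include <ranges>
#include <string>
#include <vector>

int main() {
    std::vector<std::string> names{"dog", "cat", "mouse"};
    std::vector<int> counts{3, 1, 4};

    // enumerate: bundle an index with each element, destructured in the loop.
    for (auto [i, name] : std::views::enumerate(names)) {
        (void)i; (void)name;
    }

    // zip: walk two ranges in lockstep as tuples.
    for (auto [name, n] : std::views::zip(names, counts)) {
        (void)name; (void)n;
    }

    // ranges::to: convert a view back into a concrete container.
    auto lengths = names
                 | std::views::transform([](const std::string& s) { return s.size(); })
                 | std::ranges::to<std::vector>();
    return static_cast<int>(lengths.front());  // 3
}
```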
Okay.
Well, Conor, it's been great
having you on the show again today.
Where can people find you online?
I'm at
code underscore report on Twitter.
And I think, yeah, if you're interested
in any of the stuff we talked about in the show,
I just recommend checking out the
show notes, I'm sure. All the links will be
there. And your YouTube channel.
Oh, yeah. Yeah, I don't plug that enough.
I have a YouTube channel, too. Feel free to check that out. And yeah, feel free to listen to Bryce and I; it definitely is not going to be as polished as CppCast. And yeah, maybe we'll choose sides; maybe TLB Hit and we'll choose a side of either CppChat or CppCast, and then we can have, like, a team battle or something.
A showdown of some sort.
Yeah.
Podcast off.
Nice.
Yeah.
A podcast off at some next in-person conference.
Okay.
Thanks, Bryce.
Oh, I'm sorry.
That's a compliment for me.
It's all for Bryce.
Let's keep it in the show.
It's all good.
Twitter's going to have a field day with it.
That's awesome.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic, we'd love to hear about that too.
You can email all your thoughts to feedback at cppcast.com.
We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter.
You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank
all our patrons who help support the show through Patreon. If you'd like to support us on Patreon,
you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com.
Theme music for this episode was provided by podcastthemes.com.