CppCast - Concepts and Algorithm Intuition
Episode Date: November 19, 2020

Rob and Jason are joined by Conor Hoekstra. They first talk about new and updated libraries in Boost and Herb Sutter's trip report covering news from the recent virtual ISO plenary meeting where the first new features were voted into C++23. Then they talk to Conor about some of his recent conference talks on Algorithm Intuition and Concepts vs typeclasses.

News
- Butano: a modern C++ high level engine for the GBA
- New Boost libraries in v1.75
- Trip report: Autumn ISO C++ standards meeting (virtual)

Links
- Conor's Conference Talks
- ADSP: The Podcast

Sponsors
- PVS-Studio. Write #cppcast in the message field on the download page and get one month license
- PVS-Studio: analyzing pull requests in Azure DevOps using self-hosted agents
- Why it is important to apply static analysis for open libraries that you add to your project
- Use code JetBrainsForCppCast during checkout at JetBrains.com for a 25% discount
Transcript
Episode 274 of CppCast with guest Conor Hoekstra, recorded November 19th, 2020.
Sponsor of this episode of CppCast is the PVS Studio team.
The team promotes regular usage of static code analysis and the PVS Studio Static Analysis Tool.
And by JetBrains, the maker of smart IDEs and tools like IntelliJ, PyCharm, and ReSharper.
To help you become a C++ guru, they've got CLion, an Intelligent IDE,
and ReSharper C++,
a smart extension for Visual Studio.
Exclusively for CppCast,
JetBrains is offering a 25% discount
on yearly individual licenses
on both of these C++ tools,
which applies to new purchases and renewals alike.
Use the coupon code JetBrainsForCppCast
during checkout at jetbrains.com
to take advantage of this deal.
In this episode, we talk about new boost libraries
and the virtual ISO plenary meeting.
Then we talk to Conor Hoekstra.
Conor shares his thoughts about concepts
versus type classes and much more. Welcome to episode 274 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm all right, Rob. How are you doing?
Doing good. Don't have too much to share myself.
We're getting close to Thanksgiving under these strange times.
Well, I believe our guest today already had Thanksgiving, actually.
It's true, it is celebrated in other countries differently, I guess. Right, yeah.
We'll get to him in a moment then, but first, at the top of our episode, I'd like to read a piece of feedback.
So we had Teresa Johnson on a couple weeks ago talking about ThinLTO, and I think we may have asked the question at one point, like, if she knew what MSVC's linker was like. Right.
And we got an email from Kenny Kerr, who we had on the show a while ago, and he wrote: hope you're well. I wanted to point out that following a recent show on ThinLTO, someone was internally asking about Visual C++
and asking when it would catch up.
He says he hasn't listened to the show yet,
but wanted to let us know that Visual Studio
has had multithreaded compilation
and multithreaded linktime code generation for years.
And if anything, ThinLTO is more like
how linktime code generation works today in Visual Studio.
And he says the Windows builds rely heavily on this.
Otherwise, it would basically take forever to compile.
And yeah, maybe we'll have someone to talk about the Visual Studio linker in more detail sometime.
Yeah, I don't think we've ever really talked about that.
I feel like the open source linkers get a lot more of the conversation than the Visual Studio ones,
which a lot of our listeners use.
Yeah, but it's certainly good to know that
they have this link time code generation feature option.
So if an LTO sounded great and wonderful to you,
Visual Studio has similar support.
Right.
Okay, we'd love to hear your thoughts about the show.
You can always reach out to us on Facebook, Twitter,
or email us at feedback at cppcast.com.
And don't forget to leave us a review on iTunes or
subscribe on YouTube. Joining us
today is Conor Hoekstra.
Conor is a Senior Library Software Engineer at
NVIDIA working on the Rapids team.
He is extremely passionate about programming languages,
algorithms, and beautiful code. He is the
founder and organizer of the Programming Languages
Virtual Meetup, and he has a YouTube channel.
He also recently announced at Meeting C++ and C++ Russia
that he is starting a podcast with Bryce Adelstein-Lelbach
called the Algorithms Plus Data Structures Equals Programs Podcast.
Conor, welcome back to the show.
Thanks for having me. This is super exciting.
And this is another freaking competitor to our podcast, Rob.
So to be clear, we have our own competitor. We don't compete with CppChat or CppCast. I didn't declare these our competitors; this is Bryce's doing. But there's another podcast that's been kicked off by JF Bastien called TLB Hit, and the domain is tlbh.it, which I think Hana Dusíková helped them set up.
And yeah, Bryce's one sort of talking point to me was, as long as we're better than JF's podcast, I'll be happy. So we're apparently competing with them, not with you. Good to know, good to know.
So have you released any episodes yet? We have recorded our first episode
and I'm in the midst of editing it and setting up all the podcasting things. But by the time your listeners are listening to this, it should be online, and if people want to check it out, I'm sure the information will be in the show notes.
I'd just say, as we've told many other people, consistency is the key: if you do one episode and then never release another one again, you probably won't get a lot of listeners. Yeah, ours, the goal for me is
for it to be a very low overhead podcast. So it's inspired by a podcast called Magic Readalong,
which is just two guys that sort of hop on a phone call for 20 minutes. And like half the time,
they're not talking about anything
programming related. And then every once in a while they'll just drop into category theory.
So half the time you're hearing about what they're eating for dinner or barbecue. And the other
times, you know, you're hearing some interesting things about adjunctions and, you know,
catamorphisms and stuff like that. So hopefully it's just going to be that, you know, little,
very little editing. And with that low overhead, hopefully it'll be easy to pump them out, you know,
once a week, like you guys do. Yeah. Sounds cool. And sorry, but happy Canadian Thanksgiving. What
was that? Two weeks, two weeks ago. Uh, it was back in October. So yes, the real Thanksgiving,
the Canadian Thanksgiving, about a month earlier. It depends, because it fluctuates from year to year for us; I'm not sure if it's the same for you. For American Thanksgiving, it's always the third Thursday. Thursday, yeah, or whatever, yeah, something like that.
Right, yeah. It should be interesting to see, or hear, how many people are visiting or how many people are just celebrating individually. Yeah, yeah.
We're getting a pretty significant phase of re-closures in Colorado right now leading up to this, so yeah, we'll see what happens. Yeah.
Okay, well, Conor, we got a couple news articles to discuss. Feel free to comment on any of these, and we'll start talking more about what you've been up to lately.
Sounds good.
Okay, so this first one is a new library.
It's on GitHub, and it's called Butano,
a modern C++ high-level engine for the Game Boy Advance.
We've talked about emulators a lot in the past,
but I've never heard of Game Boy emulators.
But this one looks pretty neat.
It's not a Game Boy emulator.
It's a C++ library for writing Game Boy games.
Okay, okay.
I misread that then.
So do you need an emulator or is a Game Boy Advance something you can kind of still easily get and put a game onto?
I would say you can easily get a Game Boy Advance. Getting a game onto it is slightly harder: you have to get one of the hobbyist cartridges that lets you put an SD card in it, or write your own ROM. Right. But if you use this tool to program and then launch in an emulator, that would by far be the easiest way to play with this.
Right.
Very cool.
I discussed this with a couple of people; it came up at some point recently.
It might've been at the Meeting C++ hangout, possibly. I'm not sure.
And someone's like, oh, well, that's too new for Jason, basically, because the Game Boy Advance is like an ARM7 processor.
So it's actually got wide support with compilers,
with GCC and Clang, which helps a lot for this.
I'm not trying to downplay the achievement or anything,
because something I've actually thought about doing was,
like, how hard would it be to bootstrap some C++ code on a Game Boy Advance,
and then I gave up shortly after I came up with the idea.
So I didn't mess with it again.
Do you do any Game Boy Advance programming, Conor?
I do not, no.
I didn't even have a Game Boy.
I was in school when they came out.
But yeah, it was typically at recess.
You'd have four or five kids in class that had a Game Boy
and then everyone would be huddled around them
while they were playing their Pokemon Blue or Red
or whatever those games were called.
I did not play Pokemon either, so when the AR game came out
and everyone was freaking out, I didn't really have the context
of catching them all when I was a kid.
I agree. I never played either.
Did you get in on that, Rob?
No, I didn't have a Game Boy as a kid either.
No, but did you get in on the Pokemon thing?
No, no.
I've never been into Pokemon.
I think I watched some of the cartoons as a kid, but never played the games.
Okay.
I have an anecdote about Pokemon Go.
I had worked with an intern once in my first job that had written a small program to send him texts whenever there was a Pokemon that was within a 10 minute walking radius that he currently didn't have in his whatever library. And so every once in a while, I'd see this intern jump up from his desk and bolt and say, I'll be back.
And he'd be running out to get some rare Pokemon that popped up a couple blocks away, which I just thought was fantastic that there was this Pokemon Go API you could hook into to do stuff like that.
There's still people who are really into that game.
My wife and I went to the park a few weeks ago and saw all these people heads down on their phones.
And there was a Pokemon Go meetup going on at the park. Yeah, I think it's fantastic; it gets people out doing more exercisey things. Yeah, yeah, true.
Okay, next thing we have: we haven't talked about Boost in a while, but we have this article that version 1.75 is coming out soon, I think.
It says in progress.
I'm not quite sure what that means in terms of their release schedule.
But they are adding a few new libraries with this upcoming release.
So there's going to be a new Boost JSON library,
obviously for JSON parsing and serialization.
And there's also Leaf, which is a lightweight error handling library,
and PFR. I feel like we've heard of Leaf before.
It sounds familiar.
I don't know.
No, I don't remember.
As usual, I got distracted by the first thing and went down a little rabbit hole looking
through the source code of the boost JSON library and looking at how their allocator
support works because allocators are something that I've been thinking about lately.
And then I forgot to come back to this particular article
and look at the other two libraries.
Well, how does the Boost.JSON library source code look, then?
It looks very boosty.
There's lots of macros for boost things
and boost things used in the boost library.
But other than that,
it looks like it has fairly straightforward allocator support
for passing in your own
memory resource and using your own kind of allocators
with it which is the part that I was actually interested in
Connor do you have a chance to look at any of these?
Yeah I took a glance, the one thing that caught my eye was the
addition to the MP11
metaprogramming library
So Barry Revzin had requested an algorithm that ended up getting named pairwise_fold, which is sort of the metaprogramming equivalent of adjacent_difference with a different name. So it takes adjacent elements and applies a binary operation to each pair of elements.
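Since MP11 operates on type lists at compile time, the easiest way to picture the operation is its runtime cousin. Here is a minimal sketch (my own example values, not anything from Boost's documentation) using std::adjacent_difference, which likewise applies a binary operation to each adjacent pair of elements:

```cpp
#include <functional>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> v{1, 3, 6, 10};
    std::vector<int> out(v.size());

    // adjacent_difference applies a binary op to each adjacent pair of
    // elements; the default op is minus, but any binary operation works.
    std::adjacent_difference(v.begin(), v.end(), out.begin(), std::plus<>{});

    // Quirk of the algorithm: the first output is a copy of the first input,
    // so out is {1, 4, 9, 16}, i.e. pairwise sums after the leading element.
    return out.back();  // 16
}
```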
So interestingly, because Barry actually messaged me a couple weeks ago about this when it got added.
And then I was like, why did you call it pairwise fold?
And then he didn't actually realize the name that it had ended up with.
I think he initially requested zip_tail_with. So zipWith is an algorithm from Haskell that takes two ranges and applies a binary operation. And zip tail is a trick where you do that with the first range being elements zero to n-1 and the second being one to n, which gives you adjacent pairs when you zip those two ranges. So zip_tail_with is sort of the combination of zip tail and zipWith.
And then they threw around a couple of different names like pairwise.
I could get this wrong.
I think it's from Kotlin or some other language that is the equivalent of sort of like adjacent
elements.
But they ended up adding fold to it and it's not a fold.
So when Barry realized that, he was like, oh no, that wasn't one of the names we actually considered; we considered pairwise, but somehow fold got added onto it. So that's an interesting story.
So MP11 is a compile-time algorithms, right, metaprogramming library? Compile time, I believe so, yeah. So, I mean, without looking at this, that implies to me that it's taking, like, two sets of types and doing a pairwise operation on the types. Yeah, I think potentially that is one thing you can do with it. Okay. But not a fold. Yeah.
I mean, it's interesting, the naming history on these things. In the textbook that I'm working my way through, SICP, the Structure and Interpretation of Computer Programs, they have a map, so, our equivalent of a std::transform, that takes a variadic number of sequences.
So our std::transform can take one range or two ranges. But hypothetically, you could have a std::transform that takes any number of sequences. And then instead of a unary operation or a binary operation to transform those, it can just take an n-ary operation. So a lot of languages just call this map; Clojure and Racket, so Lisp dialects, and Python do that.
In the Structure and Interpretation of Computer Programs they call it accumulate-n, which, coming from C++ land, is very confusing, because you would think that typically underscore-n for us means operating on a subset of the first n elements of your range, and accumulate is our name for reduction. But when you think about it, when you're transforming n elements from n sequences at a time down to one, you're not folding one by one over the elements, but you could argue that that is, in a sense, a reduction. They're basically saying you're accumulating over n of those elements. Anyways, there's a bit of a reduction, a bit of a transform, and every language sort of makes different decisions. So naming is
hard, as they say. Right, it's hard.
Okay. Last article we have is a trip report, a virtual trip report, I guess, from Herb Sutter, and this is about the Autumn ISO C++ standards meeting, which was held virtually. And we briefly talked about this a few weeks ago when we had JeanHeyd and Aaron and Peter on.
And yeah, it's great to hear
that they were able to actually vote on some papers
for C++23 during this virtual meeting.
And as I just mentioned, JeanHeyd,
he was actually, his paper was the first one
to get voted into C++23, which is pretty exciting.
He has his paper P0330, which is
going to add a uz literal suffix to the language
for numbers of size_t type.
Yes, I mean, I needed this suffix like four times on Monday, basically. I'm like, static_cast, static_cast. I'm like, static_cast, static_cast. All right, yeah, I'm looking forward to this.
Although it makes me wonder: if there's uz for unsigned size_t, I mean size_t, which is unsigned, I knew there had been some discussion about an sz or something also for a signed size_t, which is in places in the standard as well. It just makes me wonder if there's a signed and an unsigned version or not, but I didn't read the paper to find out. Conor, do you have any info on that? Are we getting both signed and unsigned versions of this? Ah, it looks like uz and z. Sorry.
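For context, here is a minimal sketch of what the new suffixes buy you (C++23 only; the vector and loop bodies are just stand-ins of mine):

```cpp
#include <cstddef>
#include <vector>

int main() {
    std::vector<int> v{1, 2, 3};

    // Before: a plain 0 is an int, so comparing against v.size() means a cast
    // (or signed/unsigned warnings).
    for (auto i = static_cast<std::size_t>(0); i < v.size(); ++i) { /* ... */ }

    // With P0330: 0uz is a std::size_t literal (and 0z is the corresponding
    // signed type), so auto deduces the right type directly.
    for (auto i = 0uz; i < v.size(); ++i) { /* ... */ }
}
```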
Oh, okay. Yeah, it's good to see the committee's still making progress. I know there are differing camps on whether, for C++23, we should still be aiming for it; there's a paper from a while ago, A Bold Plan for C++23, and the question is if we should still aim for that, or, as some argue, slow down a bit due to COVID times and make this sort of more of a C++14-style release, just polishing features that went into 20.
But yeah, most programming languages, whether designed by committee or designed by a core group of developers where you can submit, you know, PEPs for Python or JEPs for Java, a lot of these languages already, even before COVID, have their
development cycle on GitHub and online. So I'm really happy to see that C++ is still
making progress and slowly sort of transitioning to that model.
I should mention just a couple of other things that they did vote into C++23. We're also going to get a string contains function, which is another one that I think we should all be excited about, that we'll finally have. So you can just say if string contains whatever, as opposed to having to write that by hand with the find-doesn't-equal-npos check. Yeah, yeah.
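As a quick sketch of what that looks like (C++23; the example string and function are mine):

```cpp
// C++23 adds a contains() member to std::string and std::string_view.
#include <string>

bool mentions_cpp(const std::string& s) {
    // Pre-C++23: the find-doesn't-equal-npos dance.
    bool old_way = s.find("C++") != std::string::npos;

    // C++23: say what you mean.
    bool new_way = s.contains("C++");

    return old_way && new_way;
}
```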
And also, did you note that Richard Smith is no longer going to be the lead editor of the standard? I did not see that. Thank you to Richard Smith for his work over many years as project editor for the C++ standard and for completing C++20 this month. Oh yeah.
Starting now, we begin C++23 with Thomas Köppe, who has graciously agreed to step up as the primary project editor, while Richard will stay on as backup project editor. Cool, good for him. We should have either Richard or Thomas on; we've never had either of them on. No, we have not. Yeah.
Were you involved at all in this meeting, Conor? I was briefly an attendee; I had multiple meetings going on at that time. Okay. But I have been meeting with a small sort of algorithms-slash-ranges group for the last few months, sort of working on the paper that ended up going into the most recent mailing. And we'll discuss that in a minute, I believe.
Okay. Well, why don't we start talking more about, uh, what you've been up to lately,
Conor? I know we just had Meeting C++ 2020 last week,
and you gave a new talk there.
Is that right?
Yep.
Yeah, I attended not all of the conference,
but it was, I think, over three days.
And I think we all saw,
or I definitely saw Jason at a couple tables.
And were you there too, Rob?
I can't...
Yeah, Jason and I both did a Q&A
and hung around for a bit after that.
Oh, yeah, that's right.
Yeah, I caught the second half of your Q&A
because I was in and out of work meetings.
Yeah.
How did you guys find Meeting C++ online?
I didn't stick around for much more of the conference
after our Q&A, unfortunately.
I just had too much other stuff going on
to be an attendee this year. Yeah, I also, I mean, I hung out for a while and as always, it's good to
see people even if remotely who we haven't seen in a while. Um, and it was fun doing the AMA.
I think I'd do that again. I think it went well. Yeah. Yeah.
I guess the time zones are a bit hard for us, because it starts, I think, 2 a.m. locally for me, which means even earlier for you. Yeah, it would be like midnight that it was starting for me. So yeah, not ideal times. No, not ideal. We're going to talk to Jens about that. What was he thinking? Yeah, really.
Well, so what was your talk this year at Meeting C++? So this year, I gave a talk called C++ Concepts versus Rust Traits versus Swift Protocols versus Haskell Typeclasses, or some permutation of those language features.
I don't recall the exact order. And this was a talk that over a period of like two years in my head of sort of watching different videos and reading different content,
I had this realization that the C++20 concepts facility that we were getting has existed in other languages for decades or something very similar and has been added to more modern
languages like Rust and Swift. The first time, so I mentioned this all in the talk, I'll give like
an abridged version of it, but the first time I sort of realized this was in 2018 when I started
learning Haskell. I learned about type classes, which basically let you specify the class of a generic type,
and then basically enables you to do certain operations with that generic type.
So when you have a generic function in Haskell that, as a single input, takes a generic type A and returns as output a generic type A, you can really only do one thing with that function in Haskell, which is return it, because you don't know that you can do anything else with the generic type.
So that's a function in functional programming called identity, and I explicitly remember learning about this on another podcast called LambdaCast, which I actually started listening to because you mentioned it once in your sort of introductory news items in one of your episodes.
the most useless thing. Like, why would you ever want a function that does nothing and just returns
you back what you have? But if you start delving into the functional programming world, it actually is very useful in certain cases. One of them, for instance, is it doesn't exist in the
STL, but in the CUDA equivalent of the STL, Thrust, we have algorithms that take basically,
in some cases, a third range, which is known as a stencil. It's very, very similar to projections in C++20 range algorithms. And so you can do something like a copy_if that takes two ranges, where it copies elements from the first range if the unary predicate returns true when applied to the second range.
So in a std::copy_if in the STL, you always have to apply the unary predicate to the range that you are copying from, because you're only allowed to specify one. But an alternate design for that algorithm could take two ranges, the second of which you don't copy but are just checking. So if anyone out there is familiar with Excel, it's sort of similar to a function in Excel called SUMIF; it basically sums up elements in one range based on another range. It's the exact same idea.
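To make the stencil idea concrete, here is a small sketch using Thrust's stencil overload of copy_if; the values and names are my own illustration, not code from the talk:

```cpp
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <vector>

int main() {
    std::vector<int> v{10, 20, 30, 40};
    std::vector<int> s{1, 0, 1, 0};
    thrust::device_vector<int> values(v.begin(), v.end());
    thrust::device_vector<int> stencil(s.begin(), s.end());
    thrust::device_vector<int> out(v.size());

    // copy_if with a stencil: values[i] is copied when pred(stencil[i]) is
    // true, so the predicate never looks at the values themselves. Here the
    // predicate is simply thrust::identity, i.e. "is the stencil value truthy".
    thrust::copy_if(values.begin(), values.end(), stencil.begin(),
                    out.begin(), thrust::identity<int>{});
    // out now begins with {10, 30}.
}
```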
But sometimes with a copy_if you're just copying; if you have some equivalent of a transform, though, thrust::identity might actually be what you want in that case, or std::identity, which we're getting in C++20.
Anyways, back to Haskell: that generic A to a generic A, it's called identity. And the type classes give you the ability to say, well, my generic type A is in the type class Num, which sort of stands for numeric. And now you can do more with it. You can add them together. You can do the things that you can do with numeric generic types. And then ultimately in the talk,
I sort of show the Swift protocols and Rust traits. I also include D type constraints.
And sort of the points of my talk that I really try to drive home is that in this broad category
of what in sort of academia is referred to as constrained parametric polymorphism,
there's two categories, the concept model, which I sort of term type constraints,
and then the type class model, which I just sort of stole the name from Haskell. And so Rust,
Swift, and Haskell fit into the type class model, and D type constraints and C++ concepts fit into the type constraint model. And so the
difference is that where in Haskell or the type class model, you start with a generic function
that you can't do anything with, and then you have to specify the type classes in order to do
more things with it. In the concept model, it's the exact opposite. You start with a generic function in C++, you know, now you can just go auto F
and with a parameter, you know, auto P and that returns, you know, an auto P.
So it's a generic function that, you know, takes a single argument and has a single output.
But you can do absolutely anything with this. It might not compile, but you don't need to do any extra specifications
in order to say, you know, oh, this can work with some, you know, it's got a dot name method or a
dot area. So the example in my talk I show is having a shape concept and then a circle class and a rectangle class that adhere to this concept. In C++, you don't need to specify that shape concept in order to get the code to work.
You start with the world, and then you constrain with a concept what you're able to do with that generic type.
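A rough sketch of that example follows; the names Shape, Circle, Rectangle, and area are my own stand-ins, not necessarily the code from the talk:

```cpp
#include <concepts>

// A concept constraining what we are allowed to do with a generic type:
// it must have an area() member returning something convertible to double.
template <typename T>
concept Shape = requires(const T& t) {
    { t.area() } -> std::convertible_to<double>;
};

struct Circle {
    double r;
    double area() const { return 3.141592653589793 * r * r; }
};

struct Rectangle {
    double w, h;
    double area() const { return w * h; }
};

// Unconstrained: compiles against anything with an area() member, or fails
// only at instantiation time. This is the "start with the world" model.
double total_area_unconstrained(const auto& a, const auto& b) {
    return a.area() + b.area();
}

// Constrained: the concept narrows the world and documents the requirement.
double total_area(const Shape auto& a, const Shape auto& b) {
    return a.area() + b.area();
}

int main() {
    return static_cast<int>(total_area(Circle{1.0}, Rectangle{2.0, 3.0}));
}
```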
Anyway, so they're sort of completely opposite to each other. And I find it very interesting that in certain languages like Swift, they refer to this parametric polymorphism as constrained generics, when really, when they say unconstrained generics, that's a generic function that doesn't have any constraints on it, which you can do nothing with. And then by adding the constraints, you can do more, which sort of seems like they have the terminology backwards. And I've talked to a few people, and they've shared motivations for potentially why they did it that way. But I think it's key to
understanding that in C++, we start with the world. And by adding concepts, you are basically
restricting what you can do with your function. But in many other languages, it's the exact
opposite, you start with a generic function that does very little, and then you enable it to do
more by adding the constraints to your generic type. So that's sort of in a nutshell what the
talk's about. Interesting. You get good feedback, good questions from the audience? Yeah, there was
a few questions at the end. It was originally a 90-minute talk that I sort of cut 20% out of,
and it was supposed to be a 60-minute talk, so it was a pretty quick cadence. But afterwards, we hopped into sort of a remote Q&A.
One of the questions I got was how come I didn't compare these language features to Java and their
interfaces. C# also has interfaces; Go also has a language facility called interfaces, but it's slightly different than what Java and C# have. Definitely, you can do a comparison between those, but I can't cover every language out there in a 60-minute talk. And I said my primary motivation was that I was trying to choose languages that were adjacent to C++ and that operate sort of in the same space. So
definitely Rust, you know, a lot of people say in the same breath as C++. D, although a lot less
popular, is extremely influenced by C++. Swift is a very, very interesting language that I knew
absolutely nothing about going into sort of preparing for the talk. And it is a it is a very, very pleasant,
pleasant language to use. And I wish I wish they were focused more on sort of cross platform and
not just like mobile app development for iOS. Because it's it like when you start off with
Swift, it feels like Python. And Chris Lattner, the designer of the language, one of the values that he wanted for Swift was something called progressive disclosure of complexity,
otherwise sometimes described as wanting the language to be infinitely hackable,
so that like it feels like Python at first, which it truly does. Like you open up the playground
online, you type print parentheses, hello world and parentheses, like exactly like you would in
Python, and it works. But slowly, as you start to learn the language, you can sort of peel back the layers.
And it's an amazing experience coming from the world of C++, where we don't necessarily have the best... like, now, thanks to Godbolt, I think it's a lot easier to go and just start typing away and get started with C++.
if someone's not holding your hand and you very quickly, like I remember the first time
in high school trying to get, where was it, university, it was a while ago when I was
trying to get started with C++ and it was just awful.
You had to go download things and get them all set up and then nothing worked for three hours.
And then finally, I think I just ended up
getting Visual Studio working,
which was the easiest way.
But yeah, there's a long way to go,
I think, in C++ in terms of our like
first impression experience that we have.
I kind of feel like collectively,
we lost a lot in the 80s to 90s transition where every computer you turned it on and immediately had a programming environment.
I mean, it was basic, but you had a programming environment as soon as you turned it on that you had to do some programming to use it.
Or you could very easily, if you chose to, just type things in and see what happened.
And yeah, now you're like, you know, that doesn't exist so much in
the same way anymore. Yeah, it's part of the reason I think why Python, you know, there's
many reasons you can stipulate why Python became so popular. But with basically like every Linux
distro, it's just automatically bundled. You just type Python and poof, you hop down into
the whatever command line editor or interpreter. And yeah, I think,
I think for any language that can get like automatically included in a bunch of operating
systems, that's going to give you a big headwind. Yeah, I feel like for those of us who are maybe
slightly older, it feels like Python completely replaced Perl somehow, because Perl's, I believe,
actually part of the POSIX standard. So you knew Perl was always going to be there. And now, I don't know. The one programmer that I knew who chose to do Perl for all time is now no longer doing Perl after like 15 years or whatever it was that he programmed in Perl.
Yeah, Perl. I feel bad for Perl. Like, I only ever spent about half a week of my career working in Perl.
I had to debug Perl.
And so I don't know much about it.
All that.
Other than that, like everyone says it's one of, you know, the write once, read never, impossible to read languages.
But like I actually, I think Perl is one of those languages that like if you write it well, which I think the Perl code,
I wasn't an expert, but it seemed nice. And I really liked the single character,
although this is coming from someone whose favorite language is APL.
I was just going to say that.
The single character notation for this is a hash map, this is an array. I love that stuff.
And if you think about it, Python is somewhat similar to that.
When you go like variable equals braces, that's a dictionary.
When you go variable equals brackets, that's a list.
You have to go equal set for a hash set.
But I don't know.
I think Python developers know what that means.
When they see L equals bracket bracket, they know that that's a
list, which is somewhat similar to, you know, what Perl was relying on. So I don't know, I feel all
these languages that get bad raps, I just feel bad for them. We just need to we need to love all our
programming languages, even if they've fallen out of favor. Like, this, this programming language
needs a home. Would someone be willing to adopt it?
Yeah.
I wanted to interrupt the discussion for just a moment to bring you a word from our sponsor, PVS-Studio.
The company develops the PVS Studio Static Code Analyzer,
designed to detect errors in code of programs written in C,
C++, C Sharp, and Java.
The tool is a paid B2B solution,
but there are various options for its free licensing
for developers of open projects, Microsoft MVPs, students, and others. The analyzer is actively
developing. New diagnostics appear regularly, along with expanding integration opportunities.
As an example, PVS Studio has recently posted an article on their site covering the analysis
of pull requests in Azure DevOps using self-hosted agents. Check out the link to the article in the podcast description.
Going back to your talk on concepts
and comparing it with these features of other languages,
as someone who has studied and worked with all these languages,
do you have a preference for the constrained way concepts work
versus the type classes where you have to start adding to the type classes in order to be able to use it?
Yeah, my personal preference is the type class model.
But this is like one of the classic sort of tradeoffs where people fall into two different camps.
There are people that like... it's the same argument, in my opinion, as dynamic versus static typing. There are certain language facilities or features that require sort of more upfront thinking, and the trade-off is that you're going to have static analyzers that are going to be able to tell you more, and you have more sort of sureness of whether your code's going to be correct at the time you run it. Whereas, you know, with other languages
like Python and JavaScript, you can just sort of write it, it'll run and you'll find out at runtime,
uh, or like a month later if it works. Um, and I think there's use case, there's times when you
just need to prototype and get some, you know, proof of concept out the door. Um, and there's
other times where like being sure that something's running correctly is more important.
I tend to always fall on the side of like, I would prefer static typing so that, you know,
my compiler can do more for me. I would prefer to like specify the concepts on front and like
have the compiler force me to think about like, what do I actually want to enable my generic type to have? Whereas in the, in the C plus plus model, we now, and this is also sort
of a legacy thing, right? Like we, we are building this on top of templates, you know, potentially
they would have designed it, um, in the type class way if they could have. Um, and so I, I have no
like qualms with the design. I think it's probably a necessity. Um, but if, if I was starting from
scratch, I think like I prefer the model that, maybe let's spend a little bit more time thinking up front.
And let's get more guarantees later. Because I would, I would prefer to spend more time thinking
than more time like debugging later, is my personal preference. But like I said, like,
I understand the arguments for both. And there are some times when like, I've been trying to do something in language x. And I'm just like,
let me just try this in Python. And then like, it was 45 minutes wasted. And then five minutes in
Python, I have it working. And so and so yeah, there's there's a time and a place, I think,
for for different languages. So if you don't mind, if we go back in time a little bit here,
shortly, I think after we had you on the first time, you ended up giving the most awarded
C++ talk ever, I believe, if we were to quantify these things when you gave your algorithm intuition
talk at C++ Now, which was well regarded. And we discussed with Tony how jilted he felt because he
didn't win any awards that year. But you know, that's all fine. That's history.
Since then, I mean, you've developed,
you've done several talks now on algorithms and algorithm intuition,
and it's a huge topic,
but something we need to talk about
because it's been so well accepted talks.
What kind of things do you talk about in those talks?
What's your favorite talk?
Well, so the first thing I should say, because, cause I know now that you've mentioned Tony's
name, I'm going to hear about this from him.
I don't think Tony was actually at that C++Now 2019.
So I had heard from him that he was a bit upset that, you know, first of all, I think a couple awards were added that year, like Best New Speaker, which I won.
And then he said that that's not really fair.
Like, you can't just add awards and then say, you know, because his claim was that he would have won that stuff in the past.
Anyways, Tony was all set to come back in 2020, and apparently he was going to reclaim his throne.
So the dust hasn't settled on that. I don't think it is all in the past; I think it's in the future, and we will see the return of Tony whenever C++Now happens again.
And look, I'm a huge... he emailed me once, when I joined the Canadian national body, and he said, my nemesis. And I was like, what? Like, I'd never interacted with Tony before. And he's like, two of my favorite talks are Words of Wisdom and Postmodern C++. So, but anyways, it's all good fun.
What, you actually won like eight awards for that talk or something? I don't recall. Oh, come on, you have a stack of them, I know you do. Well, no. So Jon Kalb, the organizer, he says that there are plaques or something, but I think he's got like a few-year lag on sending those out, he said.
But yes, it was a big surprise. The talk was very well received, and yeah, I'm glad that people enjoyed it so much. It was also my first talk, too, so I was very surprised. Like, I was expecting feedback, like, oh, you know, try this next time, or try this, or tweak it slightly differently. But yeah, so the idea for the talk
initiated from a combination of two things. It was watching Sean Parent's C++ Seasoning talk from GoingNative 2013, which, like I've said, I mention in probably every podcast or talk that I've ever given, because, for me at least, it was such an inspiring talk. And yeah, it's like 50% of the reason
that I ended up giving the talk.
And then I stole the title
from a previous CppCast episode.
I think it was the first time
you had Kate Gregory on episode 30,
where she had like a little thing she said
in the midst of the episode was that,
you know, we spend a lot of time
focusing on data structures, but not as much time focusing on algorithms.
And she said this quote that was like, for people that, you know, we need to develop algorithm intuition or like an intuition for algorithms, I think was her exact words.
And that in 10 years, people that haven't done this are going to be trailing behind sort of other folks.
And as a listener, you hear that and you're like, oh no, FOMO, I'm going to be left behind, my employer's no longer going to employ me if I don't know these algorithms. So I slowly started trying to learn them. And then I sort of just went on this journey of having insights like, oh, this algorithm is similar to this algorithm.
So probably one of my biggest eureka moments in the first talk was when I was trying to solve this problem. I won't go into the details, but the gist of it is you have two ranges and you're trying to index into both of them and then do some kind of either mapping operation or reduction operation. In this case, I think it specifically was you were taking two ranges and then binding them together and ending up with a single value.
So there's sort of a zip in there, and then a transform and then a reduction.
And in Python, they have algorithm called zip, which we do not have in C++ currently, although that is in the planned C++23 range algorithms to be proposed.
So we will get it in the future. But at the time, we didn't. And that meant that there was no way,
at least that I could see at the moment, to solve it with an algorithm. So I had sort of a for loop
for my C++ solution. And then for Python, I had a zip with sort of destructuring. So you're
zipping two ranges together using the equivalent in Python of C++17 structured bindings, and then using an algorithm combined with sort of a generator expression to do it in a single
or two lines, which I thought was like super nice. And then two weeks after I had sort of made a YouTube video about how, you know,
Python's solution was much nicer than the C++ solution, I realized that we actually did have an algorithm that implicitly was doing a zip, and I just didn't know about it. And that algorithm was inner_product.
And then in Algorithm Intuition, and the follow-up, Better Algorithm Intuition, I had a whole category of, I think, five different algorithms that implicitly have zips in them. So to pause on inner_product: you can think of an algorithm that takes two ranges and a binary operation and applies that binary operation to each pair of elements at the same index, one pair at a time. That is essentially a zipping algorithm; we just don't name any of these algorithms zip-underscore-something.
So inner_product's the first one. I'm not sure if either of you two want to guess what the other four are, or we can play a game: you can guess, and then our listeners can
guess as well. What's terrible is I've watched this talk and I can't tell you.
So the more general version of inner_product is std::transform. The equivalent of zipWith currently, in the pre-C++23 or C++20 range algorithms, is the std::transform that takes two ranges. So std::transform is your sort of map from functional programming. When it takes one range, it's a map. When it takes two ranges, it's a zipWith. And then inner_product is basically a std::transform with a reduction after it.
So a std::transform takes just a single lambda or function object that it applies to either one range or two ranges. std::inner_product, or as it was renamed in C++17, std::transform_reduce, takes two lambdas or two function objects: one of them does the transform, and then one of them does the reduction.
Wait, wait, are you saying inner_product and transform_reduce are the same algorithm? Not identical, but basically std::transform_reduce is the parallel, renamed version of inner_product. Yes. Okay.
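To make that relationship concrete, here is a small sketch (my own values, not anything from the episode):

```cpp
#include <functional>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> a{1, 2, 3};
    std::vector<int> b{4, 5, 6};

    // std::inner_product: "zip" the two ranges with one binary op (multiplies
    // by default), then fold the results with another (plus by default).
    int dot1 = std::inner_product(a.begin(), a.end(), b.begin(), 0);

    // std::transform_reduce (C++17): the same idea with the two operations
    // named explicitly, and with overloads that can run in parallel.
    int dot2 = std::transform_reduce(a.begin(), a.end(), b.begin(), 0,
                                     std::plus<>{}, std::multiplies<>{});

    return dot1 == dot2 ? 0 : 1;  // both are 1*4 + 2*5 + 3*6 = 32
}
```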
And like inner_product, I think Matt Godbolt mentioned it in one of his talks that he gave at the Avast meetup that Hana Dusíková runs. Matt Godbolt gave a talk called, I think, C++, the old or the new old language, or something like that.
And then also Ivan Čukić, who wrote the Functional Programming in C++ book and has given a number of talks, mentions it in one of his as well: that inner_product is a really bad name.
And I think Alexander Stepanov chose it because he has a strong math background.
And this algorithm is called inner product in like the world of mathematics.
But really like as a generic algorithm, it's a bad name for it,
because inner product is like a domain specific name, when really what you're doing is taking
two ranges, you're transforming them into a single range with a binary operation,
and then you're reducing it down to a single element with another binary op,
in the form of what they call a reducer. And I guess I should note also that the C++17
std transform reduce is also different in that it has an overload that takes just a single range.
So inner product, you need two ranges. But std transform reduce, similar to std transform,
you can give it either a single range that it first does a transformation on and then a reduction on, or you can give it the two ranges similar to std inner product.
And the reason for that is that it seems like why would you ever want a std transform or a std reduce and do the transformation in the reduction lambda or function object.
But the thing with the C++17 parallel versions,
the transform reduce and reduce,
is that there are strict requirements on these algorithms
because they can be performed in parallel.
So the binary operations that you provide to it
have to be both commutative and associative,
which means that you are very restricted.
So for example, if you wanted to do a parallel,
you wanted to calculate the total length of strings
in a vector of strings.
So you've got a vector of strings that's got, you know, dog, cat, elephant, and mouse, and I don't know what the total is, it's roughly like 10 or 15 characters; dog and cat are three characters each. And the idea is you want to get the total length of all the strings. In pre-C++17, pre-parallel-algorithm world, you could just do a std::accumulate
that basically has an accumulator
that's going to be the running sum of the total lengths.
And then inside, you're going to go return of your lambda.
You're going to go return accumulator plus string.size.
So like in the arguments of your binary operation,
that it's a reduction,
the first element's your accumulator,
which is going to be like an integer or a size.
And the second parameter is going to be a string.
So that, when you have a binary operation
where the types are not the same,
it's definitely not commutative and associative.
So in order to do this in parallel,
what you would need to do is use a std transform reduce,
where the transform first turns the strings into lengths
of either, you know, size_t or integer, and then the reduction binary operation is now going to take an accumulator of size_t and also an element of size_t. So you do the transform first, so that you can have a commutative and associative reduction binary operation. Hopefully I didn't just put all the listeners to sleep.
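For anyone following along at a keyboard, here is a sketch of the example as just described (the strings and variable names are mine):

```cpp
#include <execution>
#include <numeric>
#include <string>
#include <vector>

int main() {
    std::vector<std::string> words{"dog", "cat", "elephant", "mouse"};

    // Sequential: the accumulator (std::size_t) and the element (std::string)
    // have different types, so this binary op is neither commutative nor
    // associative -- fine for accumulate, not allowed for a parallel reduce.
    std::size_t total1 = std::accumulate(
        words.begin(), words.end(), std::size_t{0},
        [](std::size_t acc, const std::string& s) { return acc + s.size(); });

    // Parallel-friendly: transform each string to its length first, then
    // reduce the lengths with plus, which is commutative and associative.
    std::size_t total2 = std::transform_reduce(
        std::execution::par, words.begin(), words.end(), std::size_t{0},
        std::plus<>{},
        [](const std::string& s) { return s.size(); });

    return total1 == total2 ? 0 : 1;  // both are 3 + 3 + 8 + 5 = 19
}
```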
Well, it helps from my perspective that I had actually the,
once you started talking about transform reduce,
I brought up the cppreference page with the,
the overloads that are available.
So I was glancing down at that and I was making sure I was following along.
Yeah.
So hopefully, if you're on a Peloton, you can switch the tree, or the park that you're in, to the cppreference page, and if you're on a run, I apologize.
This actually is very apropos for me, because just on Monday or Tuesday I was prototyping some code and I'm like, I know there's an algorithm I should be using here, but I needed to do an offset over the same range. And with this discussion about transform_reduce and accumulate, I'm thinking, oh, if I just swap my reasoning a little bit, I could simply apply an algorithm to it and it'll all fall out how I want it to. So I'm going to have to take another look at that later.
So, coincidentally, for what you just explained, inner_product and transform_reduce is exactly what you want for that. Yeah.
So, one of the common ones... when someone once came up to me after a talk and was like, oh, here's a problem you can't solve with an algorithm, which was like, basically, bring it on.
I prefer the more like, hey, is there an algorithm?
The more like, hey, let's try and solve this together.
But I also don't mind challenges.
But it's the, you know, given an array of elements, calculate windowed sums. So if you've got a list of 20 elements, calculate the sum of five elements at a time: from, like, first zero to four, then one to five, then two to six, et cetera. And very quickly, if your ranges and windows are big, you can get quadratic time complexity if you're doing the sum repeatedly, because you're having overlapping windows. So ideally you don't need to recalculate the overlapping part.
Typically, if you're just doing it with a for loop, you calculate the sum of the first five, and then for the next one you subtract the first element off and then add the next element. So you're just subtracting one number and adding the next number, and then you're going to get something that runs in linear time.
You can do this by first doing a std::accumulate on the first five, or n, elements that you need in your window size, and then doing a transform_reduce where basically you're carrying along some state of the previous element, and you start one range at the beginning and one range at sort of the end of your window. Or, actually, I don't even think you need to carry state, because your first iterator is going to point to the element you want to drop, and then the next iterator is going to point to the one you want to add, right?
So you can actually do it. It's a little bit... you're twisting the algorithms in a way that, you know, typically when you use inner_product or transform_reduce, you're iterating over two different ranges at the same sort of index. But yeah, you can twist algorithms to do all kinds of crazy things.
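Here is a sketch of that windowed-sum idea (my own example, not code from the talk): seed the first window with std::accumulate, then slide it in linear time.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

std::vector<int> windowed_sums(const std::vector<int>& v, std::size_t k) {
    std::vector<int> out;
    if (k == 0 || v.size() < k) return out;

    // Sum of the first window, done once.
    int sum = std::accumulate(v.begin(), v.begin() + k, 0);
    out.push_back(sum);

    // Each subsequent window: drop the element leaving the window, add the
    // one entering it. This step could also be expressed as an inner_product
    // or transform_reduce over v and v shifted by k, as discussed above.
    for (std::size_t i = k; i < v.size(); ++i) {
        sum += v[i] - v[i - k];
        out.push_back(sum);
    }
    return out;
}
```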
Well, it sounds very much like what you would need for a moving average window. Like when I'm looking at my YouTube stats and I'm looking at my 30-day moving average, I mean, that's basically the same thing.
Yep, a lot of finance applications, and programming languages that are specific to finance, have those algorithms built in, because you can do them extremely efficiently, and if you're not careful, you can do them extremely inefficiently. But a lot of them just have it built in. And I know Q is an array programming language that's used by a lot of hedge funds, and built into it they have avg and mavg, which is for moving average, because that's stuff they use all the time for ticker symbols and stock prices and whatnot.
I think it's worth noting before we move past
this also that every algorithm we've just discussed are the ones that are actually
hiding in the numeric header, not in the algorithm header. So if you go to look for them...
So yeah, this was the sort of difference between my first talk, which I aimed to be on the algorithm header,
but then I ended up falling in love with all the algorithms in the numeric header,
which is sort of what I talked about.
So inner product, accumulate, adjacent difference, iota, and I'm missing one.
What's the fourth one?
Inner product.
Oh, well, someone's going to tell us online.
And then the second talk that I gave focused on the algorithm header.
And that one was sort of very much in the same spirit
and just trying to highlight sort of hidden relationships.
So like the two probably most useful are one that Sean Parent points out
that whenever you have
a find if, so like this is a code review thing.
If you ever have a std find if in your code, think about is the underlying data sorted?
And if so, you can upgrade your linear find if to a log n binary search, either in the
form of a lower bound or upper bound.
And then there's the binary search.
And there's also two other algorithms that people don't think about very often,
which is equal range, which is also a binary search that just is like upper bound and lower bound at the same time.
And then there's the fifth, completely unheard of binary search, partition_point, which is actually an extremely useful algorithm. Like, two times now at work, I've had a colleague come to me and say, hey, is there a binary search algorithm for this, where you need a sort of binary predicate, the range is partitioned, and you're looking for the point where one of the elements returns true based on the predicate and the one to the right of it returns false.
And it's interesting, I just learned the other day that in the C++ standard they have a section for binary search algorithms, but they only list four of them; they put partition_point next to partition, in like a mutating section, which I find very interesting.
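A small sketch of that code-review upgrade (my own example data): the same query written as a linear find_if, as a lower_bound, and as a partition_point.

```cpp
#include <algorithm>
#include <vector>

int main() {
    std::vector<int> v{1, 3, 5, 7, 9, 11};  // sorted

    // Linear: walks the range element by element.
    auto a = std::find_if(v.begin(), v.end(), [](int x) { return x >= 7; });

    // Logarithmic: lower_bound gives the first element not less than 7.
    auto b = std::lower_bound(v.begin(), v.end(), 7);

    // partition_point is the most general form: the range just has to be
    // partitioned by the predicate (all true, then all false).
    auto c = std::partition_point(v.begin(), v.end(),
                                  [](int x) { return x < 7; });

    return (a == b && b == c) ? 0 : 1;
}
```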
Anyways, and then there were a couple other things I pointed out. The other relationship is std::sort versus std::partial_sort versus std::nth_element.
So this is algorithm intuition part two that you're discussing, right?
Yeah, the technical title is, yeah, better algorithm intuition.
Okay.
And yeah, that code review is about if you have a sort, ask yourself, do you need to
sort the whole range?
If not, you might be able to save some runtime by doing a partial_sort. And then if you're doing a partial_sort and you don't actually need those elements that you're sorting, say the top n of them, in sorted order, switch that to an nth_element, and now you have a linear-runtime algorithm.
I half the time forget about nth_element. It's a very useful algorithm with a cryptic name that can be very useful in certain cases.
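A sketch of that ladder (my own example: the three smallest values of a range):

```cpp
#include <algorithm>
#include <vector>

int main() {
    std::vector<int> a{9, 1, 8, 2, 7, 3, 6, 4, 5};
    std::vector<int> b = a;

    // partial_sort: only the first three positions end up sorted
    // ({1, 2, 3, ...rest unspecified}); cheaper than sorting everything.
    std::partial_sort(a.begin(), a.begin() + 3, a.end());

    // nth_element: b[2] becomes the third smallest element, everything before
    // it is <= it, but there is no ordering beyond that; linear on average.
    std::nth_element(b.begin(), b.begin() + 2, b.end());

    return a[2] == b[2] ? 0 : 1;  // both are 3
}
```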
It kind of sounds like your job is often becoming people saying,
hey, Conor, is there an algorithm for this?
I'm not sure yet.
It's my job description, but it happens from time to time at work.
And it's so fun.
I don't know.
Different things turn different people's clocks.
But yeah, I love staring at a for loop and then realizing, holy smokes, this is just... you know, oh yeah, partial_sum. That's the... how can you forget that? That's the last C++11 numeric algorithm. Because it really should be called scan.
But yeah, one time I saw a transform that just had an accumulator it was passing; it had a mutable lambda that was carrying state, and a transform that carries state is just a scan. So every once in a while you see either for loops or other algorithms that are doing something weird, and then you realize, oh, there's actually an exact algorithm for this.
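A sketch of that last point (my own example): running totals written as a state-carrying transform, as partial_sum, and as the C++17 inclusive_scan.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

int main() {
    std::vector<int> v{1, 2, 3, 4};
    std::vector<int> out(v.size());

    // The smell: a transform whose lambda smuggles an accumulator along
    // (std::transform doesn't even guarantee in-order application).
    int acc = 0;
    std::transform(v.begin(), v.end(), out.begin(),
                   [&acc](int x) { return acc += x; });

    // The algorithm that already says this: partial_sum (i.e. a scan).
    std::partial_sum(v.begin(), v.end(), out.begin());

    // Or, since C++17, inclusive_scan, which also has parallel overloads.
    std::inclusive_scan(v.begin(), v.end(), out.begin());

    return out.back();  // 10 in all three cases
}
```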
So yeah, I spend way too much of my free time thinking about this stuff.
I think we've burned almost all of our time by asking you about algorithms now.
Yeah, I would definitely encourage our listeners to watch more of your talks. We'll put a link in
the show notes with a list of all the talks you've given over the past few years. I did want to maybe call back to something you mentioned about ranges and how there's going to be more coming for ranges in C++23, and you're actually working on a paper on that, is that right?
Yeah. So, well, very briefly, I'll say first that
Barry Revzin is the primary author and has done the majority of the work for this paper, so I definitely can't talk about this without giving a shout-out to him. And then there was a small group of people: initially it was Barry, myself, and Tim Song, and then later on a few more people joined, so I apologize if I forget anyone, but Corentin, Christopher Di Bella, Tristan Brindle, and then I think Eric Niebler; we got him to hop on a couple of calls, or at least one call, to make sure, because we're basing this on all of his range-v3 work. He's busy with executors and parallel algorithms right now.
But yeah, so a bunch of us were meeting, and with the help of Barry sort of driving most of the authoring, we put together a proposal for what we think are the top-priority range adaptors and views that we want to see in C++23.
So I won't go through them all,
but the ones that people are probably most going to be excited about
are Enumerate, which is sort of, it's known by different things.
It's from Rust and Python, it's Enumerate,
which is basically bundling an index with your range
that, with C++17 structured bindings, you can immediately destructure. So, like, how many times have you used a range-for loop from C++11 but also needed the index for something, right? So this will avoid the anti-pattern of having to declare an index outside your range-based for, or falling back to one of the old index-based for loops.
Zip is also being proposed, and the equivalent of zipWith: so zip just zips them into a tuple together, and zip_transform applies a binary operation to those zipped elements so that you just end up with a single range.
And I'll say we're hoping to get formatting, so, std::format support for ranges, which will be huge; currently there's no real easy way, other than setting up your own for loop, to print out a range. And then we're also hoping to get ranges::to, which is going to be super, super important for converting a range back into, like, a vector
or something like that. So... oh, ranges underscore to? Ranges colon colon T-O, yeah. Yeah.
Well, what happened when Eric Niebler first joined one of the calls and someone said, oh yeah, ranges::to, it's the most important: he thought that was, like, ranges with the numeral two, as if we were already abandoning the current design and proposing a second version. And he said, oh, did we already get... is it already that old?
But no, yeah, that's ranges::to.
It's a conversion mechanism from ranges
to insert whatever data structure you want.
Yeah, I think it was proposed for C++20,
but it just didn't make the cut
because a lot of stuff, as you know,
went into C++20.
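For readers who want to see roughly what these look like, here is a sketch using the spellings that eventually landed in C++23; at the time of this episode they were still proposals, so treat the exact names as provisional.

```cpp
#include <ranges>
#include <string>
#include <vector>

int main() {
    std::vector<std::string> names{"dog", "cat", "mouse"};
    std::vector<int> counts{3, 1, 4};

    // enumerate: bundle an index with each element, destructured in the loop.
    for (auto [i, name] : std::views::enumerate(names)) {
        (void)i; (void)name;
    }

    // zip: walk two ranges in lockstep as tuples.
    for (auto [name, n] : std::views::zip(names, counts)) {
        (void)name; (void)n;
    }

    // ranges::to: convert a view back into a concrete container.
    auto lengths = names
                 | std::views::transform([](const std::string& s) { return s.size(); })
                 | std::ranges::to<std::vector>();
    return static_cast<int>(lengths.front());  // 3
}
```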
Okay.
Well, Conor, it's been great
having you on the show again today.
Where can people find you online?
I'm at
code underscore report on Twitter.
And I think, yeah, if you're interested
in any of the stuff we talked about in the show,
I just recommend checking out the
show notes, I'm sure. All the links will be
there. And your YouTube channel.
Oh, yeah. Yeah, I don't plug that enough.
I have a YouTube channel, too. Feel free to check that out. And yeah, feel free to listen to Bryce and I; it definitely is not going to be as polished as CppCast. And yeah, maybe we'll choose sides; maybe TLB Hit and we'll choose a side of either CppChat or CppCast, and then we can have, like, a team battle or something.
A showdown of some sort.
Yeah.
Podcast off.
Nice.
Yeah.
A podcast off at some next in-person conference.
Okay.
Thanks, Bryce.
Oh, I'm sorry.
That's a compliment for me.
It's all for Bryce.
Let's keep it in the show.
It's all good.
Twitter's going to have a field day with it.
That's awesome.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic, we'd love to hear about that too.
You can email all your thoughts to feedback at cppcast.com.
We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter.
You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank
all our patrons who help support the show through Patreon. If you'd like to support us on Patreon,
you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com.
Theme music for this episode was provided by podcastthemes.com.