CppCast - Efficient Programming with Components

Episode Date: August 19, 2021

Rob and Jason are joined by Justin Meiners. They first talk about a big Boost library update and whether Valgrind is still useful compared to sanitizers. Then they talk to Justin about Alex Stepanov, his contribution to the STL, and some of his courses that are still relevant to today's C++ programmers.

News
- Boost v1.77.0
- Intel C/C++ compilers complete adoption of LLVM
- Valgrind - A neglected tool from the shadows or a serious debugging tool?

Links
- Efficient Programming with Components - YouTube Playlist
- Efficient Programming with Components - Notes by Justin Meiners
- Merge with fewer comparisons (using goto)

Sponsors
- C++ Builder

Transcript
Starting point is 00:00:00 Episode 313 of CppCast with guest Justin Meiners, recorded August 18th, 2021. This episode is sponsored by C++ Builder, a full-featured C++ IDE for building Windows apps five times faster than with other IDEs. That's because of the rich visual frameworks and expansive libraries. Prototyping, developing, and shipping are easy
Starting point is 00:00:21 with C++ Builder. Start for free at Embarcadero.com. In this episode, we discuss a Boost library update. Then we talk to Justin Meiners. Justin talks to us about the teachings of Alex Stepanov and what we can still learn from him today. Welcome to episode 313 of CppCast, the first podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today? I'm all right, Rob. How are you doing? Doing
Starting point is 00:01:30 good. Just a few more days until school restarts here in North Carolina, so getting a little anxious about that. Anything going on with you? No, school in my area apparently has already restarted, because I have to jog past all the middle schoolers waiting for the school bus now that I'm out exercising in the morning, which is a little awkward, especially when I'm doing laps around the park or something. Although I will say, I did get a comment from a friend yesterday that they really appreciated how I tried to turn last week's episode around on you and ask you more questions about yourself. Well, on that note, I do have this piece of feedback. I was reading through the comments on Reddit from last week's episode. And
Starting point is 00:02:16 yeah, there were a couple people kind of echoing that they're happy to hear some more about what I do in my day job. And just a couple of things to comment on. I was wrong about C# and goto. C# does have goto, but it's rarely used, just as it's rarely used in C++ these days. And then this other commenter was saying: interesting episode, we finally get to hear a bit more about Rob and his work, and C++/CLI gets mentioned about a half hour in. I recently tried some things
Starting point is 00:02:48 with C++20 and got the error that C++/CLI mode does not support C++ versions newer than C++17. I'd be interested to hear whether Rob agrees that C++/CLI is a dying technology. And I'm just going to echo what this other commenter replied: that no, I definitely
Starting point is 00:03:04 don't think it is. It's a shame that you can't use some of the C++20 features with C++/CLI. But in the .NET space, you know, .NET Framework has been around for years and had new versions come out every few years. But now the focus has been on .NET Core, which is multi-platform .NET. And C++/CLI support was added in .NET Core 3.1. The only thing is, C++/CLI is still limited to Windows, even though .NET Core is multi-platform. You can use C# on Linux and Mac now, but C++/CLI, unfortunately, is still limited to Windows. I personally hope that changes.
Starting point is 00:03:42 I mean, doesn't it sound like Microsoft is deprecating it if they're not going to support C++20 with C++/CLI? But they went through the effort of bringing it into .NET Core. So the first three versions of .NET Core did not have it, but then they went and added it
Starting point is 00:03:59 in .NET Core 3.1. So if it never got added there, I would agree with you that, yeah, it's a dying thing. If you want to keep using it, you can only use it with .NET Framework. But they did go through the effort of adding it, so I don't think it's a dying technology. Yeah. All right.
Starting point is 00:04:14 Well, we'd love to hear your thoughts about the show. You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com. And don't forget to leave us a review on iTunes or subscribe on YouTube. Joining us today is Justin Meiners. Justin has been a professional programmer since he was 16. His areas of interest are 3D graphics and Unix systems. In addition to C++, he enjoys C, Lisp, and Swift. He holds an MS in mathematics, where he studied topology and computer algebra, and besides his work, he enjoys studying ancient Greek philosophy. Justin, welcome to the show.
Starting point is 00:04:47 Thanks. I'm excited to be talking with you guys today. I'm kind of curious what you were being paid to program on at 16 years old. Yeah, that's a good question. So at the time, iPhone apps were really popular. And through some experiences with some other friends, I had gotten into Cocoa and Objective-C programming. And so pretty much everybody was looking to hire iOS programmers. And I was going to a user group at the time. And I sent out a message where I said, you know, these are some skills that I have, but I'm sort of 16 years old. Is anyone interested in hiring me?
Starting point is 00:05:20 And quite a few companies actually responded. So I interviewed a bunch. These weren't people that were in my family or anything like that. So yeah, they hired me and I started working in high school. That's cool. So this is just part-time, but you are still actively going through high school and everything? That's pretty cool. Yeah, it was part-time and then full-time in the summer. I just loved it.
Starting point is 00:05:36 It seemed at that point that high school was really boring and I wasn't learning anything. Later, I did actually go to college, as you can see, so it didn't completely scare me away. But for a while, I thought, oh, you know, the school stuff's kind of a waste of time. Kind of makes me think of the mid-90s era, too, with the real start of the dot-com era. Yes, thank you, Rob. Where if you could, you know, write a few lines of HTML, you could probably find a job at the time. Yeah.
Starting point is 00:06:06 To my credit, it was a little better than just writing a few lines of HTML. But yeah, it was a great experience. I don't think it would be quite the same today to be able to do that. That's pretty cool. Those early iPhone and Android app days were wild. I still remember the... what was the name of the app? It just showed, like, a pretty red gem, and they charged you like nine hundred dollars for it. I don't know if I'm remembering that right.
Starting point is 00:06:30 I did have a co-worker who jumped immediately on board with the iPhone thing as soon as it came out. He actually printed out a mock-up of one at work, folded up like a paper craft, just to see how big it was actually going to be when it was released. And he ended up being laid off, but just jumped headfirst into iOS development, and his entire career has been iOS development and consulting and speaking and whatever since then. Yeah, for quite a few years after, I would get random acquaintances, either from my neighborhood or through my friend group, who would be like, hey, can you develop an iPhone app for me? And I had to have some screener questions of like, okay,
Starting point is 00:07:16 how are you going to support this thing? Like, what kind of budget are you looking at? And, you know, very few projects got through that filter. They're like, well, I've got $10 and I would like you to make a copy of Facebook by tomorrow. Yep. All right. Well, Justin, we got a couple news articles to discuss.
Starting point is 00:07:35 Feel free to comment on any of these and we'll start talking more about this course of Alex Stepanov, okay? Okay. All right. So this first one is a version update to Boost. This is version 1.77, and a couple new libraries being added. First one is Describe, a C++14 reflection library,
Starting point is 00:07:57 and also Lambda2. Both of those are from Peter Dimov. And then some pretty big updates to some of the existing Boost libraries. ASIO has a pretty long list of improvements, as well as Filesystem, which is now going to be Filesystem v4, which sounds like it's going to be closer to std::filesystem. Yeah, I was reading this stuff, and I'm thinking... well, a couple of things.
Starting point is 00:08:22 First one that you glossed over at the top is this new idea of a basic_any, which is a simplified any type that you can configure for how big its small object optimization is, which I think could be really useful. But I was thinking about ASIO and Filesystem. Now, the C++23 networking or whatever is based around ASIO, largely, right? Right. And Boost.Filesystem is what led to std::filesystem. And I'm looking at how these things, like the Boost versions of these things, are not staying stagnant, right? These are moving targets. They're growing and developing in a way that the standard library ones just simply can't. Sure. And it made
Starting point is 00:09:10 me kind of come back around to this question of, well, do these large library components belong in the standard? Yeah, I mean, Boost will always be out there, and as you're saying, Boost is continuing to grow, as well as other, you know, large libraries similar to it. I think there's probably a lot of people in the standards committee who would agree with you that these things just don't belong, that it'd be easier to do without them. I don't know. Justin, what are your thoughts on this? Well, I think Boost is well known as the thing that you reach for second, as soon as the standard library is too small for your needs. So I think that there's value to having it there, especially as we can see that they're able to update it at this speed
Starting point is 00:09:48 and make this many changes. And so I think that it's good that it's a separate project. But yeah, I think that they just have a good reputation for being as high quality or robust as the standard or close to it. I guess that's one way to look at it is once we get networking in, if the standard networking or the standard file system doesn't quite meet your needs, then it should be relatively easy to just swap over to the boost versions of those things. Right.
Starting point is 00:10:14 Yeah, that seems to be the whole point of this Boost.Filesystem v4, that they're trying to make it easier to go between std::filesystem and Boost.Filesystem. I actually do that in a couple projects, and we've only hit a couple of pain points. It's cool that they're helping along. Even just small extensions, I remember
Starting point is 00:10:35 for a while when they did smart pointers, it was like the boost has a few different smart pointers that didn't make the standard. They replaced the old ones with the ones that were standard and then just had their extension ones. Right. Okay, next thing we have is a post from Intel, and this is that Intel C++ compilers have completely adopted LLVM.
Starting point is 00:10:58 I don't think we've talked about the Intel C++ compilers in a while, but they're still out there. Obviously, they're getting lots of updates, and this is a pretty big change for them to go from their own tooling to using Clang. Yeah, so I've always heard about these compilers, but I've never obviously used an Intel compiler. Do you know who is the market for those specific ones
Starting point is 00:11:23 as opposed to Visual Studio or LLVM? Scientific computing, generally. The same people who are going to buy the Intel Fortran compiler, because it's highly optimized for Intel processors. So if you know that you've got a giant Intel-based cluster and you're trying to optimize your scientific application, you'll drop some money on their compiler so you can get things highly tuned for your cluster.
Starting point is 00:11:48 I see. Focus on parallelism as well. Right, yes, right. I see. So the Intel engineers can get in there and they understand the processor in a way that perhaps the LLVM people don't, and so then they can generate all this code
Starting point is 00:12:00 that specifically uses a processor. That's right. Yeah. I do find this interesting, you know, coming back to Patricia's trials and tribulations with trying to make her own
Starting point is 00:12:14 fork of Chrome that she can keep up to date. So the large argument from the Intel team here is that, you know, starting from LLVM, they're starting from a more modern code base, and it'll be easier for them to keep it up to date and whatever. And I found myself wondering here if the Intel team is going to find it actually harder to maintain a fork of LLVM than it was to maintain their own compiler. That's a good point. I think that these sorts of decisions just speak to how expensive it is to develop these kinds of
Starting point is 00:12:52 major products. Like C++ compilers at this point are just so complicated and have so many years of investment in them that just keeping up with LLVM is just really expensive. And similarly, you can see similar parallels with Chrome and JavaScript compilers
Starting point is 00:13:07 and that sort of thing. I think we should feel a little sad because in the last few years, we've watched the Borland compiler completely go away and be replaced with LLVM. The Intel compiler's gone away and replaced with LLVM. We have three compilers left.
Starting point is 00:13:20 One of the main strengths of the C++ ecosystem is that we have more than one implementation of our compiler. Let's keep supporting the three that we have left. I doubt GCC or Microsoft Visual C++ are going to go anywhere anytime soon. I fully agree GCC is not going anywhere. If Microsoft, now don't anyone start any rumors here or anything, but if Microsoft in the next five years were to announce
Starting point is 00:13:51 that they're going to stop supporting their own compiler chain and just use LLVM also, that would not shock me. I guess they do have some Clang integration already, yeah. And they dropped their own browser, so they've only got Chrome now. I mean, we've got effectively only two web browser engines left also. And that's, I feel, kind of critical to the ecosystem, to have more than one rendering engine for something as important as the web. So download Firefox. For sure.
Starting point is 00:14:21 This is kind of a difference in ecosystem between, like, C programmers, where there are all these hobby C compilers and specialized ones. You could imagine a team of two or three people making a production C compiler. That's just not going to happen with C++. No, not really. I mean, you might be able to pull it off in a couple of years and have something that's good, but then keeping it up to date is a full-time job. That better be what your plan is. Yeah. All right. And this last post we have is a blog post about Valgrind. Is it a neglected tool from the shadows or a serious debugging tool?
Starting point is 00:14:55 It's talking about, you know, some of the differences in what you can detect with Valgrind versus the sanitizers. And yeah, there's definitely still some valid uses for Valgrind, right, Jason? I absolutely think so. I definitely use it. Like they said, if I can't recompile the project,
Starting point is 00:15:11 or depending on what exactly I'm trying to measure. Because Valgrind, since it works as an emulator, can give you an exact count of how many instructions were executed. And there's no way to do that outside of Valgrind, because anything like perf or whatever is a sampling profiler, and it can only tell you approximately how many instructions were executed when your program was run.
Starting point is 00:15:35 So depending on what you're trying to collect and what you're trying to do, I absolutely still use it. I thought it was interesting that this article explains a little bit about how Valgrind works: it essentially reads the instructions out of the binary and then emulates those on its own virtual machine. That's pretty cool. I hadn't really thought about how Valgrind worked before, so that was interesting. Yeah, that's why it's 100 times slower than using the sanitizers. Yeah.
Starting point is 00:16:13 very, very slow to run. But it'll give you lots of good info. Rob, since you've, like, focused on Windows ecosystem, I think, for most of your career, right, have you ever used Valgrind? A long time ago. It's been quite a long time, but I do wish there was something more like Valgrind on Windows. There is a tool called Dr. Memory, which is not
Starting point is 00:16:36 too dissimilar, that works on all three major platforms. Have you ever tried that one? I don't think we've ever brought it up on the show. Not that I recall. You should play with that and report back next week. Maybe, maybe. Okay, well, Justin, we're going to talk about Alexander Stepanov today. So for those listeners who aren't familiar with Alex, could you tell us a little bit about him and some of his contributions to C++?
Starting point is 00:17:07 Sure. So Alex is best known for being the creator of the STL, along with Dave Musser and Meng Lee. And you might wonder, you know, why is this worth talking about 25 years later? And I think that comes to mind because we typically think of libraries as just a set of common calls for doing something. In the Python standard library, you have, you know, here's how to parse JSON, here's how to set up an HTTP server, here's how to do an HTTP client. And I'm sure we've all written libraries at some point in our careers. So why is this one special? Well, the STL is more of an ecosystem and an embodiment of the theoretical work that Alex had done over his career, of a new style of programming, a new way to think about programming.
Starting point is 00:17:58 And so it's really like this great intellectual achievement about thinking about the questions of, what are programs? How do we write code? And so it's had a tremendous effect on the ecosystem, and I think on lots of other languages now as well, like Swift. So just a little bit about Alex from a biographical perspective. He's a Russian programmer who studied math in the Soviet Union. He moved to America and worked in a lot of different research roles at large companies, places like HP, Adobe, A9, SGI, and AT&T. For a while, he was also a professor at Polytechnic University, which is now part of NYU. And he describes that time as being really interested in the foundations of programming. He said that he studied computer science first by reading Aristotle, by trying to understand the categories of reality, to sort of model those on a computer. And a lot of this research showed up in some work in Ada.
Starting point is 00:18:54 He's done some work in Scheme and Common Lisp, and a lot of that work was done with Dave Musser. But this has been a topic that he's been interested in for a long time. And so after he had done a lot of that work in Ada, he had the opportunity to write this library for C++ that was going to be the library that handles containers and algorithms and all that sort of thing. And so it's really about
Starting point is 00:19:17 defining how you write algorithms in the language, how you write reusable code. And so he went through that process. He wrote the STL. He worked with the standards committee to get it standardized. He describes this as a very stressful process, just how difficult it is to get things through the committee, and how a lot of things that he understood really well, or that were based on a lot of research, they didn't understand, or that sort of thing. So he gets the STL standardized.
Starting point is 00:19:47 He wrote almost all the code that was in the original STL that was distributed on all the major platforms: Apple, Microsoft, and Linux. Now, of course, those have changed over time. But, you know, as soon as he gets it standardized, he sort of stops being closely involved in the standards committee, because it's just so stressful for him. And so a lot of people don't really know about him.
Starting point is 00:20:12 If they hear the name, it sounds like he's just a crazy old guy, but he did this important work for us. And the other thing that he's done since then, that I'm sure most people know, is he's written two books. The first one is Elements of Programming, and the second one is From Mathematics to Generic Programming. And these sort of explore the theoretical ideas behind the STL.
Starting point is 00:20:30 Besides that, what I find really interesting about Alex is that he's not just a programmer. Alex is really a programmer who is interested in a lot of intellectual areas. He's interested in philosophy, in history, in politics, in math. And all of that understanding he brings into programming, which seems a little weird, maybe a little bit different than what most people do, but he does it in a way that's really compelling. And so he has deep reasons for everything that he does, even something as simple as: should we have the less-than operator be the
Starting point is 00:21:03 primary one, or the greater-than one? Well, when you were a kid, you learned to count up from one to three. And the only other time you count down is something like you're launching a missile or a rocket or something like that. So the right thing to do must be to count up. And so that's kind of the way that he approaches a lot of things. And in our community, the C++ community, there are lots of really smart people, lots of really hardworking people, lots of people who have contributed a lot. But I think Alex is special in that he is a serious thinker who's worthy of study. If you want to understand programming on a deeper level, I think that he's somebody that we're going to study years into the future.
Starting point is 00:21:37 I think that he's somebody that we're going to study years into the future. So I'm curious, you said when he was designing the STL, or when he was doing this early research, that it wasn't object-oriented programming. It was a new way of thinking about programming. So would you say that the STL isn't object-oriented? It isn't functional or procedural? What is it? Yeah, so the school of thought or the paradigm of programming that's ascribed to his style is called generic programming.
Starting point is 00:22:04 And this term is a little bit misleading, because people hear generic programming and they say, oh, I've used generics, I know what that is. But no, it's actually a lot more, and I'll talk a little bit more about exactly what that is. But yeah, I would say that the STL is not object-oriented programming, in that it's not about building inheritance trees. It's not about creating polymorphic interfaces. It's about encoding algorithms in generic ways that can be reusable, and describing the mathematical interface required for them to work.
Starting point is 00:22:37 So you asked also about functional programming. There's certainly a lot of elements that are functional in that they're inspired from Lisp and they follow the principle of taking inputs and producing outputs without side effect. But it's distinguished from functional programming in that all these algorithms rely on state heavily internally. And Alex talks a lot about this, how functions are great for math, they're great to do on paper, but we have actual computers in front of us that we need to write programs for. And it's silly to ignore how the actual program works in order to write code. So there are lots of good things we can learn from functional programming.
Starting point is 00:23:14 We can learn about mathematical induction. We can learn about recursion. We can learn about, like I said, these avoiding side effects. But a lot of the algorithms that we implement, they're machines. They have state. They set flags. They do all kinds of messy things inside. Okay. So Alex did this course, Efficient Programming with Components. And the reason we're talking to you today is because you decided to write these detailed notes about the course.
Starting point is 00:23:40 Do you want to tell us more about the course and what prompted you to write up all these notes? Yes. So these notes are from a course given at A9, as they were teaching their own employees how to better write code and how to better understand the STL. And specifically, it focuses on how do we write reusable code? How do we write components that we can reuse in different ways? And the way he teaches that is by writing a vertical slice of the STL. So you start out with very basic functions, how do we write min, how do we write max, and he works all the way up to more complicated things like stable sort. And so it's really insightful to see how much thought goes into each of these algorithms. You may say, oh, min and max, you know, that's a waste of time, I'm not going to learn about it. Well, he says the max in the standard is actually wrong.
Starting point is 00:24:30 He got it wrong. So there are all these details that you need to work through. And so you can see his thought process in designing it. But then also, you just gain a greater appreciation of how the STL works and a lot of understanding. And there are two main concepts going on in the course. So he's talking about algorithms: How do we write algorithms in a reusable way? What are the kinds of problems that we need to think about?
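The "max is wrong" point can be made concrete. In the course, Alex argues that with a strict weak ordering, min should return the first of two equivalent arguments and max the second, so that the pair of results is stable; std::max returns its first argument on ties, which breaks that symmetry. A sketch of the idea (the function names here are mine, not from the course):

```cpp
// Sketch of Stepanov's min/max argument. Given a strict weak ordering
// cmp, equivalent elements should keep their original relative order:
// min returns the first of two equivalent arguments, max the second.
// std::max returns the first argument on ties, violating this.
template <typename T, typename Compare>
const T& sel_min(const T& a, const T& b, Compare cmp) {
    return cmp(b, a) ? b : a;  // yields a when a and b are equivalent
}

template <typename T, typename Compare>
const T& sel_max(const T& a, const T& b, Compare cmp) {
    return cmp(b, a) ? a : b;  // yields b when a and b are equivalent
}
```

With equivalent but distinguishable values, say records compared only by key, sel_min and sel_max together hand back both original objects in their original order, which is exactly what a stable sort built on them requires.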
Starting point is 00:24:53 How do we wrap them up in interfaces as well as concepts? And most people have heard of concepts as the C++20 feature, but concepts have been something that's been around for a long time. If you go read the original SGI STL documentation, I believe it's still on the Boost website. It talks about concepts. It says, here's what a forward iterator is. Here's what a regular type is. Here's what a semi-regular type is. And that's what this course is all about, is defining those fundamental concepts and then using those to write algorithms. And the reason why I wrote these notes is I actually came across this course when I was younger and I didn't quite get it. I watched
Starting point is 00:25:31 through the videos, and there's lots of interesting things going on. He taught me a little bit about constructors, about value types, about caches, but it was mostly just, oh, this is interesting. It seems like there's a lot more here, but I don't really understand all the things that are going on. So I came back to it years later, and I'd studied a lot more computer science, a lot more math, ontology, Common Lisp. And watching it that second time, there was so much richness in all the things that he was talking about. It's linked to so many different areas. There's so much depth in all that he's talking about. And so I wanted to provide a way to share this experience, because the videos are a little bit tedious to watch. You have to deal with people interrupting and asking questions, and, you know, they'll actually
Starting point is 00:26:13 make mistakes, and then a few sessions later they have to come back and correct their mistake. So I wanted to correct all that, get it nice in text, almost as a historical record of what he actually said, because there are so many interesting things in there. But then also, a lot of the material is covered in his books, some of the theoretical material, but there you don't get it integrated with this history, with these jokes, with these, you know, sort of absurd statements. All of that needs to be part of the experience to understand who Alex is. And so the books are a little bit lacking in this area. So that's another reason why I thought the course was important. Was this available on YouTube or something, the original course? It is. Yeah, you can find it on the A9 YouTube channel. And
Starting point is 00:26:57 there are actually a few other courses that are similar. Some of them cover a lot of the same material; others are more about his From Mathematics to Generic Programming style of teaching. But this is the one that I feel is the most important as far as understanding what Alex is about. And so then you've also published your notes separately, is that right? That's correct. And my understanding is that that will be linked in the show notes. Yeah, I'll put the show notes out with both a link to your notes and the original YouTube links. Yeah. Great.
Starting point is 00:27:27 I want to interrupt the discussion for just a moment to bring you a word from our sponsor, C++ Builder, the IDE of choice to build Windows applications five times faster while writing less code. It supports you through the full development lifecycle to deliver a single source code base that you simply recompile and redeploy. Featuring an enhanced Clang-based compiler, Dinkumware STL, and packages like Boost and SDL2 in C++ Builder's Package Manager, and many more. Integrate with continuous build configurations quickly with MSBuild, CMake, and Ninja support, either as a lone developer or as part of a team. Connect natively to almost 20 databases like MariaDB, Oracle, SQL Server, Postgres, and more with FireDAC's high-speed direct access.
Starting point is 00:28:06 The key value is C++ Builder's frameworks, powerful libraries that do more than other C++ tools. This includes the award-winning VCL framework for high-performance native Windows apps and the powerful FireMonkey framework for cross-platform UIs. Smart developers and agile software teams write better code faster using modern OOP practices and C++ Builder's robust and feature-rich IDE.
Starting point is 00:28:28 Test drive the latest version at Embarcadero.com. So what algorithms or other parts of the STL did Alex talk about in this course that you found particularly interesting or useful? Yeah, so a few interesting things are the very basic algorithms that he re-evaluates. Like I said: min, max, swap. There's a lot of thought that goes into those. But there are two that caught my eye as really interesting and exciting, things that I had never thought about in this way before. So the first is one that's called a balanced reduce. And we're probably familiar with a left fold, or reduce, where you have a range of elements and you have a binary operation, and you want to combine all those elements using that binary operation.
Starting point is 00:29:17 Like an accumulate or something, basically. Yeah, that's right. And in the standard it's called accumulate. And so, you know, you might apply plus to a range of integers to add them all up. And what that does is it starts from the left, combines the first element with the element next to it, then continues on to the third element and merges that in, and goes all the way down the list. And what this algorithm that Alex comes up with does is it changes the order of the evaluation so that things are combined only when they've been combined with an equal number of other things. So what that kind of means is you divide the range in half and you merge up
Starting point is 00:29:57 the first elements, and then you take the second half and you merge those, and then you merge those, and so on up, building this tree of association instead of a linear association. And it does this in a way that's really efficient: it uses binary counting, the same way you learn to add binary digits. And it has so many applications. So the ones that he goes through: you can write merge sort with this. You take a list of lists, and the operation that you apply is something that can merge two sorted lists. And so you just do that, and it does merge sort.
Starting point is 00:30:37 And it does it perfectly balanced, you know, where it's split evenly. You can use it to add up large amounts of floating point numbers. So typically doing an accumulate for floating point numbers is bad because as you add up the list, you eventually get a really big number and the ones that are left are really small. And if you know anything about floating point precision, those small numbers just can't be represented very well to be added to this large number.
Starting point is 00:31:04 But using a balanced reduce, you tend to add up things that are a similar size, and so you can get much more accurate additions that way. And the other application that he does with it is finding the smallest and second smallest element in a list, which seems like a very specialized problem. But he does it in a way that's really efficient, and it's really interesting just to see another application of the balanced reduce. And so this is actually one that I've gotten interested in.
Starting point is 00:31:32 I wrote a proposal for the Swift standard library as well. We'll see how that goes. I have no idea. Have you submitted that proposal then? I submitted it on the forums, generated a little bit of discussion. People were mostly fixated on the application for adding numbers, and they said, well, we're going to maybe get this in numerics. So not a lot of people had other comments on it. So it doesn't have a proposal
Starting point is 00:31:54 number that anyone could go look up right now or anything like that? No, and the best place for it might be the algorithms package as opposed to the standard itself, because I guess that's the place where they're staging those things now, in a similar way to Boost. So the other algorithm that I find really interesting is this merge of sorted lists. This is the operation that's applied to merge two lists in that merge sort: you have one list that's sorted, and you have another list that's sorted, and we want to combine them so that the end result is sorted. And this is a linear algorithm; you can kind of walk through step by step, and you just keep taking the smallest element on either side until you've made the final list. But this one that he writes
Starting point is 00:32:34 is for the case when they are in the same buffer. So you have an array. The left side of the array is the one list to be merged, and the other side of the array is the other part of the list to be merged. And then we want to combine these, or mesh them together, so that the whole thing is merged. And he just finds this beautiful recursive solution to do it where you kind of rotate the one into place
Starting point is 00:32:58 and it's a little bit hard to describe. But it was really exciting, and it's very efficient in that, unlike these other ones that use linked lists, you're not building a new buffer. It's the same buffer, and it's all about rotating and swapping elements. And so it's just really exciting to see applications of these computer science principles that we know, about recursion, about induction, and seeing them play out in a beautiful way in C++. Sorry, but when you just said that they were rotating the elements, I had an immediate Sean Parent "it's a rotate!" kind of flash.
Starting point is 00:33:31 Anyhow, sorry. All right. So it seems like a lot of what I learned in computer science about the efficiency of algorithms and such has just kind of been thrown away lately because it didn't take into account modern caching architectures and the design of CPUs and that kind of thing. And I'm kind of curious, from the stuff that you've watched, how do these algorithms hold up
Starting point is 00:33:56 on modern architecture and how do we measure and even reason about the performance of them? Yeah, so a while ago, maybe two episodes ago, you guys were talking about applications of GoTo. You remember that?
Starting point is 00:34:08 So if you're like me, sometimes I'll get the optimizer bug and I go into a problem and I say, oh, you know, I know all this stuff about caches and I know all this stuff about branch prediction and I'm going to rearrange this way and it's going to be really fast.
Starting point is 00:34:22 And so I do that, and then I run it, and it's obviously slower, and I have no idea what I'm doing. You guys have that experience? I don't tend to optimize to that level. I tend to go, hey, can I make this code simpler and just let the compiler do more work? That tends to be my mode. That's probably the right thing to do. So Alex is not that guy. What he does in one algorithm, this merge of sorted lists once again, this algorithm I'm talking about, is he looks at it and says, you know, we're doing one more comparison every time than we have to do. So I'm going to rewrite this algorithm, and I'm going to use goto to maintain the state of which side I'm
Starting point is 00:35:02 on, whether I'm on the left side of the list or the right side of the list, and I'm going to remove this extra comparison. And I actually ran this. I benchmarked it because I couldn't believe it. It's a 20% to 30% speedup from him using goto to remove this comparison. It's unbelievable. And so, yeah, for a lot of the stuff that he talks about, almost every specific example he went through, as I did the same benchmarks they do, I got the same results, in that the way that he teaches is faster. So if you're sort of an intermediate-level programmer, and you haven't thought much about cache, about locality, about linked-list-type structures, about dispatching on virtual, then I think everything you're going to learn in the course is really relevant. The trends are still the same in hardware. We have slow memory to read and write
Starting point is 00:35:48 from. We have pipelining and we have fast CPUs. So those things that he teaches about that are still very relevant. But if you're going to take it to the next level of optimization, obviously you need to be up to date with the newest hardware. But I think he's right at the line of where it's still good stuff to learn. As far as performance measurement, they do a few interesting things. So obviously they have a time measurement and it's what everyone else does. You can mark, here's how many clocks I have in the CPU at the start. Here's how many clocks I have in the CPU at the end. And let's figure out the duration of an algorithm. But he also creates this class called instrumented. And this is based on
Starting point is 00:36:26 the concept of regular types. Regular types are types that behave like we expect, like integers. They have constructors. They have assignment operators. They have, oh, I'm forgetting all the things. But anyway, he writes the singleton, which wraps any regular type, and it counts how many of each operation it does. So it counts how many times you construct it. It counts how many times you copy construct it. And then you can run an algorithm, and you can see exactly how many of those operations it does. So they use both of these tools to kind of analyze the algorithm, and one is more based on the traditional computer science. How many operations are we taking? And you can normalize these things to common functions. You can look at how does it
Starting point is 00:37:09 compare to log n or something like that. But what's really interesting is that, well, first of all, it's useful because it stops your code from optimizing away the thing you're trying to benchmark. It at least has to count the operations correctly. So it won't just not do your problem, which is a common problem with benchmarks. And then another thing that it does is it allows you to have both these perspectives. Is it the duration or is it the operations? And the reason why that's useful is because
Starting point is 00:37:33 there's always this problem in benchmarking of, well, it's fast on my machine; I don't know if it'll be fast on machines in the future. And then also with generic programming, you're always writing generic code. So it's fast for float, it's fast for int, but will it be fast for vector of map? And you know, you can't really get that with just the duration. So the counting of operations helps here. And he gives examples of times when the one measurement tells you what's fast, and times when
Starting point is 00:38:01 it's the other that's meaningful. One example he looks at is calling std::unique versus using a set to remove all the duplicate elements in a list. And if you count operations, the set is actually better. The set does exactly what you expect: it's using some kind of tree or hash sort of thing to pick out the elements individually. But with std::unique, you have to sort the range, and then you call std::unique, and it will use more operations. But yeah, it's much better as far as the cache goes, as far as the kinds of operations it's doing. So both are useful perspectives, and I think just understanding that class, that instrumented class which counts operations, would be useful for a lot of programmers. I find that really fascinating, actually, because I've written that kind of instrumented class probably several dozen times at this point while teaching classes, so that I can
Starting point is 00:38:55 show my students, oh, well, you know, when we do a push_back into a vector, this is what happens; we can watch the elements get moved into the new memory location or whatever. But I've never once considered using it to actually observe the efficiency of code that I wrote. I've only ever used it to observe the behavior of the standard, basically. That's really interesting to me. Yeah. And you might want to look in the code. They have some nice ways to format that information in tables, and ways to adjust that, like I said, with scale, so you're not looking at, oh, you know, I can't really compare a hundred billion to a thousand. It'll normalize it based on one of these growth functions, so you can tell how closely it matches n squared or log n.
Starting point is 00:39:45 concepts earlier. How do you think C++20 concepts feature matches Alex's original vision of the STL? Yeah, this is kind of a tough question. But I think the best way to answer it is to
Starting point is 00:40:01 understand a little bit more about what concepts are. So like I said, concepts are not something new in C++. They're something that exists outside of C++. They're something that exists outside of programming even. They are like mathematical and logical structures. So an example of one of these are things from abstract algebra for people who are into math. We have groups. We have rings. Most programmers are probably familiar with vector space. And what a concept is, is it's something that has, it's a type. You have a type, and types have associated types,
Starting point is 00:40:36 and they have associated values, and they have associated operations. And so, you know, a vector space has the scalars and vectors, and you can multiply scalars by vectors, you can multiply scalars by scalars, you can add scalars, and you can add vectors. And all of these operations, besides you having access to them, have restrictions on how they're supposed to behave. So, for example, the scalar addition or multiplication needs to be associative.
Starting point is 00:41:02 If it's not associative, it's not a vector space. And so concepts, as Alex has talked about them, he's always described the concepts in comments that go before the code. So he'll say sort requires a forward iterator. Okay. And, you know, the forward iterator is just like one of these vector space things. It says you have a value type, you have an advance function, and that sort of thing.
Starting point is 00:41:28 And so what the C++20 concepts allow you to do is essentially define a restriction on a type where you can list sets of constraints, and the constraints are little blocks of code that have to compile with the result that you expect in order to pass the concept. So for example, you could say a forward iterator is a concept, and it needs to have a value type member,
Starting point is 00:41:53 and that value type member needs to be a regular type or something like that. Or it needs to have an increment operation, and that increment operation needs to satisfy some other constraint. And so the fundamental thing here, I think, is type predicates, things that you can evaluate on types to ensure that they're true or false. And my understanding of the concepts in C++20 is this is the concepts light and not the more complicated concepts that they were going to do. And from what I can tell, it does this type predicate, this associated types well, and it acts as sort of like a better documentation
Starting point is 00:42:31 to be able to write things like, here's a forward iterator, and now you're going to go implement a forward iterator. The compiler is going to make sure you actually got all the things in there. And similarly, you can write an algorithm. You can say, oh, it takes a forward iterator, and it will check all those things in ways that it didn't before, whereas before you had to remember the list. So I think it's certainly a step in the vision that he had. There's probably some details that he would disagree about. In particular, as I looked
Starting point is 00:42:59 through the library itself of concepts that are in the standard library, there's not a whole lot of them, and they tend to be focused on details about C++ programming and not more of these abstract concepts, if that makes any sense. So for example, there's only one concept I could see related to ordering, which is totally ordered. And in the C++ library, there's actually algorithms that use all different kinds of ordering relationships. So perhaps there's some differences there in the actual concepts that made it into the standard library. If you're curious more about this subject, in one of the videos in the course, the A9 course, Bjarne actually comes to their class and walks them through an early version of concepts.
Starting point is 00:43:44 And he walks through, here's how it would look in the code. And he and Alex, as you can imagine, have kind of a back and forth about different features. I didn't include that in my course notes just because it's kind of a derailment from the rest of the course, and it's about Bjarne and not about Alex.
Starting point is 00:44:03 But if you're curious about that topic, that would be more where you could learn about it. Very interesting. So you mentioned in your show notes, Paul McJones, but I don't think that name has come up yet today, has it? No. So who is Paul and why should we be discussing him? Yeah. So Paul is a friend of Alex who worked with him mostly at Adobe, and he is closely involved in his Elements of Programming project. And he's a very quiet guy.
Starting point is 00:44:31 He is actually there in the Efficient Programming with Components course. He's there attending. And he's also in a lot of other lectures that Alex gives about Elements of Programming, but he'll never engage. He'll sometimes spout off some knowledge about things, but he's a very quiet guy. But he actually plays a really important role in shaping Alex's theoretical work and shaping the book Elements of Programming. In particular, you know, whenever you're doing a big writing project like this, you have to determine a sort of voice or a style of presenting the information. And what Paul really helped Alex do with Elements of Programming is to structure the text in the same way that Euclid's Elements is structured, or traditional mathematical work,
Starting point is 00:45:14 where you have definitions and axioms and theorems. All the discussion text around that information, you throw it away; all the information should be there in the definitions and axioms. And if you really understand the definitions and the axioms, then you'll understand the material. And so, like I said, Alex is this big character. He has lots of opinions about things. And Paul kind of helped him to focus that into the Elements of Programming text in the way that we have it. And I think that text is challenging for that reason, and a lot of people may not get a lot out of it because of that. But I think it's important to have one representation of Alex's work framed in that way, as Paul helped him do.
Starting point is 00:45:57 And so, yeah, I think Paul studied math as well. If you want to see more about him, there's a lecture that Alex and Paul give at Stanford about the Elements of Programming book. And Paul participates in that one a little bit more and shares some opinions. So that's all I have about Paul. So we've talked about algorithm design and performance and analysis up to this point. Does Alex talk about correctness of code or safety or other concerns? Yeah. So one of the questions he brings up is how do we know code is correct or what makes code correct? And the first answer that comes to mind as engineers is we might say, well, code is correct
Starting point is 00:46:36 if it follows the specification. Now, Alex points out, you almost never have a specification unless you're implementing a standard or you're emulating something. You're the one coming up with the specification. And he also has examples of code that he would say is written incorrectly. So one example is in C, you have this bsearch algorithm, which is a binary search implementation. And he says it's incorrect because it doesn't return any useful information. What it returns is just the element that it finds. But the whole reason why you want binary search is because you want to insert things or delete things. It's a very general operation.
Starting point is 00:47:15 So it returns the wrong thing. So it's incorrect. So he defines correctness in this way of it not only has to sort of satisfy a specification, whatever that means, but it has to do something useful. And so whenever we're writing out, we need to think about what's actually the thing that it's supposed to do, what's its job, and how is the person going to use this? And so one of the very practical things he talks about is make sure you return all the useful information. If you did work to do something to find an answer, just return that work as a separate argument. They might need that work. You already did the work.
Starting point is 00:47:53 Don't make them do it again. So a lot of the STL algorithms follow this. So, for example, the find or the delete, they'll return the iterator of where they ended up. So you can continue the find. You can continue the delete. You already did the work, you don't want to do it again. And that's an important component of reusability. As far as safety, he makes this big emphasis on preconditions, and obviously satisfying the constraints of the concepts that an algorithm requires to work. So, you know, if an algorithm says this is a forward
Starting point is 00:48:25 iterator, it doesn't need to work on things that are not forward iterators. That's not safety. Safety is not, you know, if you plug the wrong thing in, you might not get the right answer. Safety is: when you satisfy the preconditions, then you're guaranteed to get the postconditions. And this is represented in a practical way in the way he writes algorithms. He'll often write a version of an algorithm that requires absurd preconditions. He does this with quicksort, and it says there needs to be an element at the start of the list which is smaller than all the elements before you can run this; but then when you run it, it'll sort the rest of the list. And then he'll write a version that ensures that precondition is met and then calls that algorithm. And this is a style of programming that I haven't seen in
Starting point is 00:49:09 very many places, but I think it's very powerful. It lets you think about the assumptions that you're making and really boil down an algorithm to its fundamental parts, because a lot of what we do, especially the error checking at the beginning, saying is the list empty, do we have all the things that we need, is not essential to the actual part of the algorithm. It's just making sure some preconditions are satisfied. And there's a related angle on memory safety: hey, look, computers have pointers. Computers have memory.
Starting point is 00:49:54 Computers have resources. And while pretending that those things don't exist is often useful for eliminating classes of bugs, ultimately you have to satisfy preconditions to get things to work. You can write bad code in any language. You can write wrong code in any language, and so we have to satisfy preconditions to get our code to work correctly.
Starting point is 00:50:16 Okay. We've, you know, talked about several things. Maybe to wrap up, are there any particular lessons you got from studying Alex's work that we haven't talked about already? I would say that one important lesson, besides this whole concept of generic programming, besides this correctness of code, that sort of thing, is the importance of performance; it's a lesson that I haven't seen taught in the same way. If you think about the role of writing a standard library component, as opposed to something that you use every day at
Starting point is 00:50:52 your work, you know, or it's just sort of day-to-day code, the standard library component is going to be used forever. We've been stuck with these things for 25 years, and you never know what kind of thing people are going to plug into them. For my master's thesis, I did some work on something called braid groups, and I had a less-than comparison operation which takes about five seconds on my data set to evaluate. And so being able to use an algorithm which counts every single less-than operation and ensures that it's minimal is actually really valuable for the kind of work that I was doing. And I think that's the importance of performance in fundamental code. I get that not every piece of code is written that way. Maybe in your project, you're using 80 percent STL, 20 percent your own internal library.
Starting point is 00:51:43 You know, you really want to make that 20% internal library work as good as the STL. And so I think performance is a very important part of getting code to be reusable because anytime you have something that's not optimal, there's going to be some case that comes across that needs it to be optimal or needs a faster version. And you almost get this peace of mind when you know the STL version is about as good as anything that I could write. And you know it's going to do exactly what you think, and so you can feel confident
Starting point is 00:52:13 in plugging the operation that you want to in there. And there was a lot of confusion regarding the STL in the air around the time that it came out. You've probably heard stories about how people in video games said, we're not going to use the STL. A lot of this just comes down to the fact that they didn't understand it. Some of it's maybe compiler support
Starting point is 00:52:34 for the platforms that they were looking at. Even if you're not going to use vector or map or any of these things, it's like the find operation, you can't write a better find. It's just not going to happen. I think a lot of this is people not being aware of what's in there and how it's designed. And performance is one of these things that makes it so reusable and able to, you know, withstand the test of time with all these different data sets that we have. Well, Justin, it was great having you on the show today.
Starting point is 00:53:00 Thank you so much for telling us all about this course and your thoughts on Alex Stepanov's work. Like I said, we'll put the link to the show notes and the link to YouTube and to your course notes into our show notes. Thanks for coming on the show. Yeah, thank you. And I want to comment that if anyone has corrections, they're certainly welcome or additional footnotes, that sort of thing. So I appreciate you guys taking the time to talk with me about it. Awesome. Thanks a lot. Thanks, Justin. Thanks so much for listening in as we chat about C++.
Starting point is 00:53:31 We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic. We'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. We'd also appreciate it if you can like CppCast on Facebook and follow CppCast on Twitter. You can also follow me at robwirving and Jason at lefticus on Twitter. We'd also like to thank all our patrons who help support the show through Patreon.
Starting point is 00:53:57 If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com slash cppcast. And of course you can find all that info and the show notes on the podcast website at cppcast.com Theme music for this episode was provided by
