CppCast - Efficient Programming with Components
Episode Date: August 19, 2021
Rob and Jason are joined by Justin Meiners. They first talk about a big Boost library update, and whether Valgrind is still useful compared to sanitizers. Then they talk to Justin Meiners about Alex Stepanov, his contribution to the STL, and some of his courses that are still relevant to today's C++ programmers.
News: Boost v1.77.0; Intel C/C++ compilers complete adoption of LLVM; Valgrind - A neglected tool from the shadows or a serious debugging tool?
Links: Efficient Programming with Components - YouTube Playlist; Efficient Programming with Components - Notes by Justin Meiners; Merge with fewer comparisons (using goto)
Sponsors: C++ Builder
Transcript
Episode 313 of CppCast with guest Justin Meiners,
recorded August 18th, 2021.
This episode is sponsored by C++ Builder,
a full-featured C++ IDE for building Windows apps
five times faster than with other IDEs.
That's because of the rich visual frameworks
and expansive libraries.
Prototyping, developing, and shipping are easy
with C++ Builder.
Start for free at Embarcadero.com.
In this episode, we discuss a Boost Library update.
Then we talk to Justin Miners.
Justin talks to us about the teachings of Alex Stepanov
and what we can still learn from him today. Welcome to episode 313 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today?
I'm all right, Rob. How are you doing?
Doing good. Just a few more days until school restarts here in North Carolina, so getting a little anxious about that. Anything going on with you?
No. School in my area apparently has already restarted, because I have to jog past all the middle schoolers waiting for the school bus.
Now I'm out exercising in the morning, which is a little awkward.
Especially when I'm doing laps around the park or something.
Although I will say I did get a comment from a friend yesterday that they really appreciated how I tried to turn last week's
episode around on you and ask you more questions about yourself. Well, on that note, I do have
this piece of feedback. I was reading through the comments on Reddit from last week's episode. And
yeah, there were a couple people kind of echoing that they're happy to hear some more about what I
do in my day job. And just a couple of things to comment on.
I was wrong about C# and goto.
C# does have goto, but it's rarely used, the same way it's not really used in C++ these days.
And then this other commenter was saying, interesting episode.
We finally get to hear a bit more about Rob and his work.
And C++/CLI gets mentioned about a half hour in.
I recently tried some things
with C++20 and got the error that
C++/CLI mode does not support
C++ versions newer than
C++17. I'd be
interested to hear whether Rob
agrees that C++/CLI is a dying technology.
And I'm just going to echo
what this other commenter replied, that no, I definitely
don't think it is. It's a shame that you can't use some of the C++20 features with C++/CLI.
But in the .NET space, you know, .NET Framework has been around for years and had new versions
come out every few years. But now the focus has been on .NET Core, which is like multi-platform .NET. And C++/CLI support was added in .NET Core 3.1.
The only thing is, it is still limited to Windows,
even though .NET Core is multi-platform.
You can use C# on Linux and Mac now.
But C++/CLI, unfortunately, is still limited to Windows.
I personally hope that changes.
I mean, doesn't it sound like Microsoft is
deprecating it if they're not going to support
C++20 with C++/CLI?
But they went through the effort
of bringing it into .NET Core.
So the first three versions
of .NET Core did not have it,
but then they went and added it
in .NET Core 3.1.
So if it never got added there, I would agree
with you that, yeah, it's a dying thing.
If you want to keep using it, you can only use it with .NET Framework.
But they did go through the effort of adding it,
so I don't think it's a dying technology.
Yeah.
All right.
Well, we'd love to hear your thoughts about the show.
You can always reach out to us on Facebook, Twitter,
or email us at feedback at cppcast.com.
And don't forget to leave us a review on iTunes, or subscribe on YouTube. Joining us today is Justin Meiners. Justin has been a professional programmer since
he was 16. His areas of interest are 3D graphics and Unix systems. In addition to C++, he enjoys C,
Lisp, and Swift. He holds an MS in mathematics, where he studied topology and computer algebra,
and besides his regular work, he enjoys studying ancient Greek philosophy.
Justin, welcome to the show.
Thanks. I'm excited to be talking with you guys today.
I'm kind of curious what you were being paid to program on at 16 years old.
Yeah, that's a good question.
So at the time, iPhone apps were really popular.
And through some experiences with some other friends,
I had gotten into Cocoa and Objective-C programming. And so pretty much everybody was looking to hire iOS programmers.
And I was going to a user group at the time. And I sent out a message where I said, you know,
these are some skills that I have, but I'm sort of 16 years old. Is anyone interested in hiring me?
And quite a few companies actually responded. So I interviewed a bunch. These weren't people that
were in my family or anything like that.
So yeah, they hired me and I started working in high school.
That's cool.
So this is just part-time, but you are still actively going through high school and everything?
That's pretty cool.
Yeah, it was part-time and then full-time in the summer.
I just loved it.
It seemed at that point that high school was really boring and I wasn't learning anything. Later,
I did actually go to college, as you can see.
So it didn't completely scare me away.
But for a while, I thought, oh, you know, the school stuff's kind of a waste of time.
Kind of makes me think of the mid-90s kind of era, too, with like the real start of the dot-com era.
Yes, thank you, Rob.
Where if you could, you know, write a few lines of HTML, you could probably find a job at the time.
Yeah.
To my credit, it was a little better than just writing a few lines of HTML.
But yeah, it was a great experience.
I don't think it would be quite the same today to be able to do that.
That's pretty cool.
Those early iPhone and Android app days were wild.
I still remember the...
what was the name of the app? It just showed like a pretty
red gem, and they charged you like nine hundred dollars for it. I don't know what the max was, right?
I did have a co-worker who, like, jumped immediately on board with the iPhone thing
as soon as it came out. He actually printed out a mock-up of one at work,
and folded it like a paper craft, just to see how big it was actually going to be when it was released.
And he ended up being laid off, but just jumped headfirst into iOS development.
And his entire career has been iOS development and consulting and speaking and whatever since then. Yeah, for quite a few years after, I would get random acquaintances,
either from my neighborhood or through my friend group,
who would be like, hey, can you develop an iPhone app for me?
And I had to have some screener questions of like, okay,
how are you going to support this thing?
Like, what kind of budget are you looking at?
And, you know, very few projects got through that filter.
They're like, well, I've got $10
and I would like you to make a copy of Facebook by tomorrow.
Yep.
All right.
Well, Justin, we got a couple news articles to discuss.
Feel free to comment on any of these
and we'll start talking more about this course
of Alex Stepanov, okay?
Okay.
All right.
So this first one is a version update to Boost.
This is version 1.77, and a couple new libraries being added.
First one is Describe, a C++14 reflection library,
and also Lambda 2.
Both of those are from Peter Dimov.
And then some pretty big updates to some of the existing Boost libraries.
ASIO has a pretty long list of improvements, as well as Filesystem,
which is now going to be Filesystem v4,
which sounds like it's going to be closer to std::filesystem.
Yeah, I was reading this stuff, and I'm thinking...
Well, a couple of things.
First one that you glossed over at the top is this new idea of a basic_any, which is a simplified any type that you can
configure for how big its small object optimization is, which I think could be really useful.
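If I'm reading the release notes right, usage looks something like this. This is a sketch from memory, not code from the release notes; the exact header path and the meaning of the two template parameters (buffer size and alignment, as I recall) may differ.

    #include <boost/any/basic_any.hpp>  // header path is an assumption
    #include <string>

    int main() {
        // Hypothetical configuration: a 24-byte, 8-aligned small-object buffer.
        boost::anys::basic_any<24, 8> a = std::string("fits in the buffer");
        a = 42;  // small values like this avoid a heap allocation
    }
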
But I was thinking about ASIO and Filesystem. Now, the C++23 networking or whatever is based around ASIO largely, right?
Right.
And boost file system is what led to standard file system.
And I'm looking at how these things, like the boost versions of these things are not staying stagnant, right?
These are moving targets.
They're growing and developing in a way that the standard library ones just simply can't.
Sure. And it made me kind of come back around to this question of, well, do these large library components belong in the standard?
Yeah. I mean, Boost will always be out there, and as you're saying, Boost is continuing to grow, as well as other, you know, large libraries similar to it. I think there's probably a lot of people in the standards committee who
would agree with you that these things just don't belong. It'd be easier to do without them.
I don't know. Justin, what are your thoughts on this?
Well, I think Boost is well known as the thing that you reach to second as soon as the standard
library is too small for your needs. So I think that there's value to having it there,
especially as we can see that they're able to update it at this speed
and make this many changes.
And so I think that it's good that it's a separate project.
But yeah, I think that they just have a good reputation
for being as high quality or robust as the standard or close to it.
I guess that's one way to look at it is once we get networking in,
if the standard networking or the standard file system doesn't quite meet your needs,
then it should be relatively easy to just swap over to the boost versions of those things.
Right.
Yeah, that seems to be the whole point of this Boost.Filesystem v4,
that they're trying to make it easier to go between std::filesystem and Boost.Filesystem.
I actually do that in a couple
projects, and it hasn't been...
We've only hit a couple of pain points.
It's cool that they're helping
along.
Even just small extensions, I remember
for a while when they did smart pointers, it was
like the boost has a few different smart
pointers that didn't make the standard.
They replaced the old ones with the
ones that were standard and then just had their extension ones.
Right.
Okay, next thing we have is a post from Intel,
and this is that Intel C++ compilers have completely adopted LLVM.
I don't think we've talked about the Intel C++ compilers in a while,
but they're still out there.
Obviously, they're getting lots of updates,
and this is a pretty big change for them
to go from their own tooling to using Clang.
Yeah, so I've always heard about these compilers,
but I've never obviously used an Intel compiler.
Do you know who the market is for those specific ones,
as opposed to Visual Studio or LLVM?
Generally, scientific computing.
The same people who are going to buy the Intel Fortran compiler
because it's highly optimized for Intel processors.
So if you know that you've got a giant Intel-based cluster
and you're trying to optimize your scientific application,
you'll drop some money on their compiler
so you can get things highly tuned for your cluster.
I see.
Focus on parallelism as well.
Right, yes, right.
I see.
So the Intel engineers can get in there
and they understand the processor in a way
that perhaps the LLVM people don't,
and so then they can generate all this code
that specifically uses a processor.
That's right.
Yeah.
Yeah.
I do find this interesting, you know,
you know, coming back to Patricia's trials and tribulations with trying to make her own
fork of Chrome that she can keep up to date.
And I wonder if the,
so the large argument from the Intel team here is that,
you know, starting from LLVM, they're starting from a more modern code base and it'll be easier for them to keep it up to date and whatever.
And I found myself wondering here if the Intel team is going to find it actually harder to maintain a fork of LLVM than it was to maintain their own compiler. That's a good point.
I think that these sorts of decisions
just speak to how expensive
it is to develop these kinds of
major products. Like C++
compilers at this point are just
so complicated and have so many years
of investment in them that just keeping up
with LLVM is just really expensive.
And similarly,
you can see similar parallels
with Chrome and JavaScript compilers
and that sort of thing.
I think we should feel a little sad
because in the last few years,
we've watched the Borland compiler
completely go away and be replaced with LLVM.
The Intel compiler's gone away
and replaced with LLVM.
We have three compilers left.
One of the main strengths of the C++ ecosystem
is that we have more than one implementation of our compiler.
Let's keep supporting the three that we have left.
I doubt GCC or
Microsoft Visual C++ are going to go anywhere anytime soon.
I fully agree GCC is not going anywhere.
If Microsoft, now don't anyone start any rumors here or anything,
but if Microsoft in the next five years were to announce
that they're going to stop supporting their own compiler chain
and just use LLVM also, that would not shock me.
I guess they do have some Clang integration already, yeah.
And they dropped their own browser, so they've only got Chrome now.
I mean, we've got effectively only two web browser engines left also.
And that's, I feel, kind of critical to the ecosystem, to have more than one rendering engine for something as important as the web.
So download Firefox.
For sure.
This is kind of a difference in ecosystem from, like, C programmers,
where there are all these hobby C compilers and specialized ones. You could imagine a team of two or three people making a production C compiler. That's just not going to happen with C++.
No, not at all. I mean, you might be able to pull it off in a couple of years and have something
that's good, but then keeping it up to date is a full-time job. That better be what your plan is.
Yeah. Yeah.
All right.
And this last post we have is a blog post about Valgrind.
Is it a neglected tool from the shadows or a serious debugging tool?
It talks about,
you know, some of the differences in what you can detect with Valgrind versus the
sanitizers.
And yeah,
there's definitely still some valid uses for Valgrind, right, Jason?
I absolutely think so.
I definitely use it.
Like they said, if I can't recompile the project
or depending on what exactly I'm trying to measure,
because Valgrind, since it works as an emulator,
it can give you an exact count
of how many instructions were executed.
And there's no way to do that outside of Valgrind,
because anything like perf or whatever is a sampling profiler.
And it can tell you approximately how many instructions were executed
when your program was run.
So depending on what you're trying to collect
and what you're trying to do, I absolutely use it still.
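For example, the exact-count workflow Jason is describing usually goes through Valgrind's callgrind tool. A minimal session looks like this (the program name is a placeholder, and the output file's pid suffix varies per run):

    valgrind --tool=callgrind ./my_program
    callgrind_annotate callgrind.out.<pid>   # per-function instruction counts
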
I thought it was interesting in this article to explain a little bit
about how Valgrind works in that it essentially reads the instructions out of the binary and then emulates those on its own virtual machine.
And that's pretty cool. I hadn't really thought about how Valgrind worked before in the past, so that was interesting.
Yeah, that's why it's 100 times slower than using the sanitizers. Yeah. And that is the thing the article points out, that that is the main downside to Valgrind versus the sanitizers,
that you're going to have a huge amount of memory overhead
and the program's going to really just be slow,
very, very slow to run.
But it'll give you lots of good info.
Rob, since you've, like, focused on Windows ecosystem,
I think, for most of your career, right,
have you ever used Valgrind?
A long time ago. It's been quite a long
time, but I do wish there was something more like Valgrind on Windows.
There is a tool called Dr. Memory, which is not
too dissimilar, that works on all three major platforms.
Have you ever tried that one? I don't think we've ever brought it up on the show.
Not that I recall.
You should play with that and report back next week.
Maybe, maybe.
Okay, well, Justin, we're going to talk about Alexander Stepanov today.
So for those listeners who aren't familiar with Alex,
could you tell us a little bit about him and some of his contributions to C++?
Sure. So Alex is best known for being the creator of the STL, along with Dave Musser and Meng Lee.
And you might wonder, you know, why is this worth talking about 25 years later?
And I think that that comes to mind because we typically think of libraries as just a set of common calls for doing something.
You know, Python standard library, you have, you know, here's how to parse JSON.
Here's how to set up an HTTP server.
Here's how to do an HTTP client.
So I'm sure we've all written libraries at some time in our careers.
You know, why is this one different? Well, the STL is more of an ecosystem and an embodiment of the theoretical work that Alex had done over his career, of a new style of programming, a new way to think about programming.
And so it's really like this great intellectual achievement about thinking about the questions of like, what are programs?
How do we write code?
And so it's had a tremendous effect on the ecosystem. And I think lots of other languages now as well, like Swift. So just a little bit about Alex from a biographical
perspective. So he's a Russian programmer, studied math in the Soviet Union. He moved to America,
worked in a lot of different research roles at large companies, at places like HP,
Adobe, A9, SGI, AT&T. For a while, he was also a professor at Polytechnic University, which is now part of NYU. And he describes that time as being really interested in the foundations
of programming. He said that he studied computer science first by reading Aristotle, by trying to understand the categories of reality to sort of model those on a computer.
And a lot of this research showed up in some work in Ada.
He's done some work in Scheme and Common Lisp.
And a lot of that work was done with Dave Musser.
But this has been a topic that he's been interested in for a long time.
And so after he had done a lot of that work in Ada,
he had the opportunity to write this library for C++.
That was sort of going to be the library that handles containers
and algorithms and all that sort of thing.
And so how do you actually,
it's defining how you write algorithms in the language,
how do you write reusable code?
And so he went through that process.
He wrote the STL.
He worked with the standards committee to get it standardized. He describes this as like a very stressful process to work
with the standard and just how difficult it is to get things with the committee and how
a lot of things that he understands really well or that's based on a lot of research
they didn't understand or that sort of thing. So he gets the STL standardized.
Most of his code, you know, he wrote almost all the code that was in the original STL
that was distributed on all the major platforms on Apple, Microsoft, and Linux.
Now, of course, those have changed over time.
But, you know, as soon as he gets it standardized,
he sort of stops being closely involved in the
standard committee because
it's just so stressful for him.
And so a lot of people don't really know about him.
If they hear the name, it sounds like he's just a crazy
old guy, but he did this
important work for us. And
the other thing that he's done since then, that I'm sure
most people know, is he's written two books. The first
one is Elements of Programming,
and the second one is From Mathematics to Generic Programming.
And these sort of explore these theoretical ideas behind the STL.
Besides that, what I find really interesting about Alex is that he's not just a programmer.
Alex is really a programmer who is interested in a lot of intellectual areas.
He's interested in philosophy, he's interested in history, he's interested in politics, he's interested in math. And all of that understanding he brings
into programming, which seems a little weird, maybe seems a little bit different than what
most people do, but he does it in a way that's really compelling. And so he has deep reasons
for everything that he does, even as simple as, should we have the less than operator be the
primary one or the greater one?
Well, when you were a kid, you learned to count up from one to three. And the only other time you count down is something like you're launching a missile or a rocket or something like that. So
the right thing to do must be to count up. And so that's kind of the way that he approaches a lot
of things. And so a lot of people that are in our community, the C++ community, there are lots of
really smart people, lots of really hardworking people,
lots of people who have contributed a lot.
But I think Alex is special in a way that he is a serious thinker that's worthy of study.
And if you want to understand programming on a deeper level,
I think that he's somebody that we're going to study years into the future.
So I'm curious, you said when he was designing the STL,
or when he was doing this early research, that it wasn't object-oriented programming.
It was a new way of thinking about programming.
So would you say that the STL isn't object-oriented?
It isn't functional or procedural?
What is it?
Yeah, so the school of thought or the paradigm of programming that's ascribed to his style is called generic programming.
And this term is a little bit misleading because people hear generic programming and they say,
oh, I've used generics. I know what that is.
But no, it's actually a lot more, and I'll talk a little bit more about exactly what that is.
But yeah, I would say that the STL is not object-oriented programming
in that it's not about building inheritance trees.
It's not about creating polymorphic interfaces.
It's about encoding algorithms in generic ways that can be reusable and describing the mathematical interface required for them to work.
So you asked also about functional programming.
There's certainly a lot of elements that are functional in that they're inspired from Lisp
and they follow the principle of taking inputs and producing outputs without side effect.
But it's distinguished from functional programming in that all these algorithms rely on state heavily internally.
And Alex talks a lot about this, how functions are great for math, they're great to do on paper,
but we have actual computers in front of us that we need to write programs for.
And it's silly to ignore how the actual program works in order to write code.
So there are lots of good things we can learn from functional programming.
We can learn about mathematical induction.
We can learn about recursion.
We can learn about, like I said, these avoiding side effects.
But a lot of the algorithms that we implement, they're machines.
They have state.
They set flags. They do all kinds of messy things inside.
Okay. So Alex did this course, Efficient Programming with Components.
And the reason we're talking to you today is because you decided to write these detailed notes about the course.
Do you want to tell us more about the course and what prompted you to write up all these notes? Yes. So these notes are from a course given at A9, as they were teaching their own employees
about how to better write code, how to better understand the STL. And specifically, it focuses
on how do we write reusable code? How do we write components that we can reuse in different ways? And the way he teaches that is by writing a vertical slice
of the STL. So you start out with very basic functions: how do we write min, how do we write max?
And he works all the way up to more complicated things like stable sort. And so it's really
insightful to see both how much thought goes into each of these algorithms. You may say, oh, min and max, you know, that's a waste of time.
I'm not going to learn about it.
Well, he says the max in the standard is actually wrong.
He got it wrong.
So there's all these details that you need to work through.
And so you can see his thought process in designing it.
But then also you just gain a greater appreciation of how the STL works and a lot of understanding.
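To make that max complaint concrete: as I understand the argument (this sketch is mine, not code from the course), std::max returns its first argument when the two are equivalent, while Stepanov's position is that it should return the second, so that min and max together behave like a stable sort of two elements.

    #include <algorithm>

    // std::max is specified as: return (a < b) ? b : a; so for equivalent
    // arguments it returns the first. Stepanov's point: min should return
    // the first of equivalents, max the last, so the pair (min, max) is a
    // stable sort of two elements. A sketch of the version he'd call correct:
    template <typename T>  // T models TotallyOrdered
    const T& max_corrected(const T& a, const T& b) {
        return (b < a) ? a : b;  // returns b when neither is smaller
    }
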
And there's two main concepts going on in the course.
So he's talking about algorithms.
How do we write algorithms in a reusable way?
What are the kinds of problems that we need to think about?
How do we wrap them up in interfaces as well as concepts?
And most people have heard of concepts as the C++20 feature,
but concepts have been something that's been around for a long time.
If you go read the original SGI STL documentation, I believe it's still on the Boost website. It talks about concepts.
It says, here's what a forward iterator is. Here's what a regular type is. Here's what a semi-regular
type is. And that's what this course is all about, is defining those fundamental concepts and then
using those to write algorithms. And the reason why I wrote these notes
is I actually came across this course when I was younger and I didn't quite get it. I watched
through the videos and there's lots of interesting things going on. He taught me a little bit about
constructors, about value types, about caches, but it was mostly just, oh, this is interesting.
It seems like there's a lot more here, but I don't really understand all the things that are going on. So I came back to it years later, and I'd
studied a lot more computer science, studied a lot more math, ontology, common LISP. And watching
it that second time, it was like there was so much richness in all the things that he was talking
about. It's linked to so many different areas. There's so much depth in all that he's talking
about. And so I wanted to provide a way to share this experience because the videos are a little bit tedious to watch.
You have to deal with people interrupting and asking questions and, you know, they'll actually
like make mistakes. And then a few courses later, they have to come back and correct their mistake.
So I wanted to correct all that, get it nice in text, almost as a historical record of what he actually said, because there's so many interesting
things in there. But then also, a lot of the material is covered in his books, some of the
theoretical material, but you don't get it integrated with this history, with these jokes,
with these, you know, sort of absurd statements. All of that needs to be part
of the experience to understand who Alex is. And so the books are a little bit lacking in this area. So that's
another reason why I thought the course was important. Was this like available on YouTube
or something, the original course? It is. Yeah, you can find it on the A9 YouTube channel. And
there are actually a few other courses that are similar. Some of them cover a lot of the same
material. Others are more about his math, generic programming style of teaching.
But this is the one that I feel is the most important as far as understanding what Alex is about.
And so then you've also published your notes separately, is that right?
That's correct. And my understanding is that that will be linked to in the show notes.
Yeah, I'll put show notes out with both a link to your notes and the original YouTube links.
Yeah.
Great.
I want to interrupt the discussion for just a moment to bring you a word from our sponsor, C++ Builder.
The IDE of choice to build Windows applications five times faster while writing less code.
It supports you through the full development lifecycle to deliver a single source code base that you simply recompile and redeploy.
Featuring an enhanced Clang-based compiler,
Dinkumware STL, and packages like Boost and SDL2 in C++ Builder's Package Manager,
and many more. Integrate with continuous build configurations quickly with MSBuild,
CMake, and Ninja Support, either as a lone developer or as part of a team.
Connect natively to almost 20 databases like MariaDB, Oracle, SQL Server, Postgres, and more with FireDAC's high-speed direct access.
The key value is C++ Builder's frameworks,
powerful libraries that do more than other C++ tools.
This includes the award-winning VCL framework
for high-performance native Windows apps
and the powerful FireMonkey framework for cross-platform UIs.
Smart developers and agile software teams write better code faster
using modern OOP practices
and C++ Builder's robust and feature-rich IDE.
Test drive the latest version at Embarcadero.com.
So what algorithms or other parts of the STL did Alex talk about in this course that you found particularly interesting or useful?
Yeah, so a few interesting
things are the ones, the very basic algorithms that he re-evaluates. Like I said, min, max,
swap. There's a lot of thought that goes into those, but there's two that caught my eye as
really interesting and exciting and things that I had never thought about in this way before.
So the first is one that's called, it's called like a balanced reduce.
And we're probably familiar with a left fold or reduce where you have a range of elements and you have a binary operation and you want to combine all those elements using that binary operation.
Like an accumulate or something, basically.
Yeah, yeah, that's right.
And the standard is called accumulate.
And so, you know, you might apply plus to a range of integers to add them all up.
And what that does is it sort of starts from the left and combines it with the element next to it and merges those up.
And then it continues on to the third element and merges those up and goes all the way down the list. And what this algorithm that Alex comes up with does is it changes the order of the
evaluation so that things are combined only when they've been combined with an equal number of
other things. So what that kind of means is you like divide the range in half and you merge up
the first elements and then you take the second half and you merge those and then you merge those
and so on up, building this tree of
association instead of a linear association. And it does this in a way
that's really efficient: it uses binary counting, the same way you learn to add binary
digits. And it has so many applications. So for the ones that he goes through: you can write merge sort with this. You take a list of lists,
and the operation that you apply is something that can merge two sorted lists.
And so you just do that.
And it does merge sort.
And it does it perfectly balanced, you know, where it's split evenly.
You can use it to add up large amounts of floating point numbers. So typically doing an accumulate for floating point numbers is bad because as
you add up the list,
you eventually get a really big number and the ones that are left are really
small.
And if you know anything about floating point precision,
those small numbers just can't be represented very well to be added to this
large number.
But using a balanced reduce, you tend to add up things that are similar size.
And so you can get much more accurate additions that way.
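A minimal sketch of that binary-counter idea, as I understand it from the course notes. This is my reconstruction, not Alex's code, and the names are made up; counter slot i holds a partial reduction of exactly 2^i elements, so op only ever combines equal-sized pieces.

    #include <functional>
    #include <optional>
    #include <vector>

    template <typename T, typename Op>
    T balanced_reduce(const std::vector<T>& xs, Op op, T identity) {
        std::vector<std::optional<T>> counter;  // counter[i]: reduction of 2^i elements
        for (T x : xs) {
            // "Add" x into the binary counter, carrying while a slot is full.
            std::size_t i = 0;
            while (i < counter.size() && counter[i]) {
                x = op(*counter[i], x);  // combine two equal-sized partial results
                counter[i].reset();
                ++i;
            }
            if (i == counter.size()) counter.emplace_back(x);
            else counter[i] = x;
        }
        // Fold whatever remains, smallest (most recent) slots first,
        // keeping earlier elements on the left for non-commutative ops.
        T result = identity;
        for (auto& slot : counter)
            if (slot) result = op(*slot, result);
        return result;
    }

With op as std::plus<double>{} and identity 0.0, this tends to add partial sums of similar magnitude, which is the floating point accuracy win he describes; with an op that merges two sorted lists, it becomes a balanced merge sort.
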
And the other application that he does with it is this finding the smallest and second
smallest element in a list, which seems like a very specialized problem.
But he does it in a way that's really efficient.
And it's really interesting just to see another application of the balanced reduce.
And so this is actually one that I've gotten interested in.
I wrote a proposal for the Swift standard library as well.
We'll see how that goes.
I have no idea.
Have you submitted that proposal then?
I submitted it on the forums,
generated a little bit of discussion.
People were mostly fixated on the application for adding numbers, and they said, well, we're going to maybe
get this in numerics. So not a lot of people had other comments on it. So it doesn't have a proposal
number that anyone could go look up right now or anything like that? No, and the best place for it
might be the algorithms package as opposed to the standard itself, because I guess that's the place
where they're staging those things now in a similar way to Boost, I guess. So the other algorithm that I
find really interesting is this merge sorted lists. So this is the operation that's applied
to merge two lists in that merge sort, which is: you have one list that's sorted, and you have
another list that's sorted, and we want to combine them so that the end result is
sorted. And this is a linear algorithm, you can kind of walk through step by step and you sort of just keep taking
the smallest element on either side until you've made the final list. But this one that he writes
is for the case when they are in the same buffer. So you have an array. The left side of the array
is the one list to be merged. And the other side of the array is the other part of the list to be
merged. And then we want to combine
these or mesh them together so that the whole thing
is merged. And he just finds
this beautiful recursive solution to
do it where you
kind of rotate the one into place
and it's a little bit hard to describe.
But it was really exciting and it's
very efficient in that
unlike these other ones that are linked lists, you're not building a new buffer.
It's the same buffer, and it's all about these rotating and swapping elements.
And so they're just really exciting to see applications of these computer science principles that we know about recursion, about induction, and seeing them play out in a beautiful way in C++.
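A sketch of that recursive shape, for the curious. This is my reconstruction, close to the classic SGI-era "merge without a buffer" rather than Alex's exact lecture code; std::inplace_merge does this kind of thing when no scratch memory is available. Merge the sorted halves [first, middle) and [middle, last) of one buffer:

    #include <algorithm>
    #include <iterator>

    template <typename It>
    void merge_in_place(It first, It middle, It last) {
        auto len1 = std::distance(first, middle);
        auto len2 = std::distance(middle, last);
        if (len1 == 0 || len2 == 0) return;
        if (len1 + len2 == 2) {                 // two elements: swap if out of order
            if (*middle < *first) std::iter_swap(first, middle);
            return;
        }
        It cut1 = first, cut2 = middle;
        if (len1 > len2) {                      // split the longer side in half...
            std::advance(cut1, len1 / 2);
            cut2 = std::lower_bound(middle, last, *cut1);
        } else {                                // ...and find the matching cut in the other
            std::advance(cut2, len2 / 2);
            cut1 = std::upper_bound(first, middle, *cut2);
        }
        It new_middle = std::rotate(cut1, middle, cut2);  // bring the middles together
        merge_in_place(first, cut1, new_middle);          // merge the left pieces
        merge_in_place(new_middle, cut2, last);           // merge the right pieces
    }
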
Sorry, but when you just said that they were rotating
the elements in, I had an immediate, like, Sean Parent "that's a rotate" kind of flash.
Anyhow, sorry. All right. So it seems like a lot of what I learned in computer science about
the efficiency of algorithms and such has just kind of been thrown away lately, because it didn't take into account modern caching architectures and design of CPUs
and that kind of thing.
And I'm kind of curious from what the stuff
that you've watched,
like how do these algorithms hold up
on modern architecture
and how do we measure and even reason
about the performance of them?
Yeah, so a while ago,
maybe two episodes ago,
you guys were talking about
applications of GoTo.
You remember that?
So if you're like me,
sometimes I'll get the optimizer bug
and I go into a problem and I say,
oh, you know, I know all this stuff
about caches and I know all this stuff
about branch prediction
and I'm going to rearrange this way
and it's going to be really fast.
And so I do that and then I run it
and that's obviously slower, and I have no idea what I'm doing. You guys have that experience?
I don't tend to optimize to that level. I tend to go, hey, can I make this code simpler and just let the compiler do more work? That tends to be my mode.
That's probably the right thing to do. So Alex
is not that guy. So what he does in one algorithm,
he has this merge sorted lists, once again, this algorithm I'm talking about, and he looks
at it and says, you know, we're doing one more comparison every time than we have to do. So I'm
going to rewrite this algorithm, and I'm going to use goto to maintain the state of which side I'm
on, whether I'm on the left side of the list or the right side of the list, and I'm going to remove this extra comparison. And I actually ran this.
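Roughly the shape of that goto version, as a sketch based on the linked write-up rather than Alex's exact code: after taking an element from one side, only that side's end condition needs rechecking, so the redundant per-iteration test of the other range disappears.

    #include <algorithm>

    template <typename I, typename O>
    O merge_with_goto(I first1, I last1, I first2, I last2, O out) {
        if (first1 == last1) goto copy2;
        if (first2 == last2) goto copy1;
        while (true) {
            if (*first2 < *first1) {
                *out++ = *first2++;
                if (first2 == last2) goto copy1;  // only side 2 moved
            } else {
                *out++ = *first1++;
                if (first1 == last1) goto copy2;  // only side 1 moved
            }
        }
    copy1:
        return std::copy(first1, last1, out);  // side 2 exhausted
    copy2:
        return std::copy(first2, last2, out);  // side 1 exhausted
    }
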
I benchmarked it because I couldn't believe it. It's a 20% to 30% speed up from him using GoTo
to reduce this comparison. It's unbelievable. And so, yeah, a lot of the stuff that he talks about, almost every specific example he went through, as I do the same
benchmark they do, I get the same results, in that the way that he teaches is faster. So if you're
benchmark they do, I get the same results in that the way that he teaches is faster. So if you're
sort of an intermediate level programmer, you haven't thought much about cache, about locality,
about linked list type structures, about dispatching on virtual, then I think everything you're going
to learn in the course is really relevant. The trends are still the same in hardware. We have slow memory to read and write
from. We have pipelining and we have fast CPUs. So those things that he teaches about that are
still very relevant. But if you're going to take it to the next level of optimization,
obviously you need to be up to date with the newest hardware. But I think he's right at the
line of where it's still good stuff to learn. As far as performance measurement, they do a few interesting things.
So obviously they have a time measurement and it's what everyone else does. You can mark,
here's how many clocks I have in the CPU at the start. Here's how many clocks I have in the CPU
at the end. And let's figure out the duration of an algorithm. But he also creates this class
called instrumented. And this is based on
the concept of regular types. Regular types are types that behave like we expect, like integers.
They have constructors. They have assignment operators. They have, oh, I'm forgetting all
the things. But anyway, he writes this instrumented class, which wraps any regular type, and it counts how many of each operation it does.
So it counts how many times you construct it.
It counts how many times you copy construct it.
And then you can run an algorithm, and you can see exactly how many of those operations it does.
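A minimal sketch of that instrumented wrapper. This is my simplified version, not the course's code, which is more elaborate: it wraps any regular type and bumps a counter for each regular operation an algorithm exercises.

    #include <cstddef>

    template <typename T>  // T models Regular
    struct instrumented {
        static inline std::size_t constructions = 0, copies = 0,
                                   assignments = 0, comparisons = 0;
        T value;

        instrumented(const T& v) : value(v) { ++constructions; }
        instrumented(const instrumented& x) : value(x.value) { ++copies; }
        instrumented& operator=(const instrumented& x) {
            value = x.value; ++assignments; return *this;
        }
        friend bool operator<(const instrumented& a, const instrumented& b) {
            ++comparisons; return a.value < b.value;
        }
        friend bool operator==(const instrumented& a, const instrumented& b) {
            ++comparisons; return a.value == b.value;
        }
    };

So, for instance, sorting a std::vector<instrumented<int>> and then reading instrumented<int>::comparisons tells you exactly how many comparisons that sort performed on that input.
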
So they use both of these tools to kind of analyze the algorithm, and one is more based on the traditional computer science.
How many operations are we taking? And you can normalize these things to common functions. You can look at how does it
compare to log n or something like that. But what's really interesting is that, well, first of
all, it's useful because it stops your code from optimizing away the thing you're trying to
benchmark. It at least has to count the operations correctly. So it won't just not do your problem,
which is a common problem with benchmarks.
And then another thing that it does
is it allows you to have both these perspectives.
Is it the duration or is it the operations?
And the reason why that's useful is because
there's always this problem in benchmarking of,
well, it's fast on my machine.
I don't know if it'll be fast on machines in the future.
And then also with generic programming,
you're always writing generic code.
So it's fast for float, it's fast for int, but will it be fast for vector of map? And, you know,
you can't really get that with just the duration. So the counting operations helps here. And he
gives examples of times when the one tells you the information of what's fast, and times when
it's the other way around. One example he looks at is calling std::unique
versus using a set to remove all the duplicate elements in a list. And if you count operations,
the set is actually better. The set does exactly what you expect: it's using some kind of
tree or hash sort of thing to pick out the elements individually.
But with unique, you have to sort the range, and then you call unique, and it will use more operations.
But yeah, it's much better as far as the cache goes, as far as the kinds of operations it's doing.
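The two approaches being compared look roughly like this. A sketch only; in the course the element type would be the instrumented wrapper above so the operations get counted, rather than plain int.

    #include <algorithm>
    #include <set>
    #include <vector>

    std::vector<int> dedup_sort_unique(std::vector<int> v) {
        std::sort(v.begin(), v.end());                    // more comparisons...
        v.erase(std::unique(v.begin(), v.end()), v.end());
        return v;                                         // ...but contiguous and cache-friendly
    }

    std::vector<int> dedup_with_set(const std::vector<int>& v) {
        std::set<int> s(v.begin(), v.end());              // fewer counted operations...
        return {s.begin(), s.end()};                      // ...but node-based and cache-hostile
    }
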
So both are useful perspectives, and I think just understanding that class, that instrumented class which counts operations, would be useful for a lot of programmers. I find that really fascinating, actually, because I've written that kind of
instrumented class probably several dozen times at this point while teaching classes, so that I can
show my students, oh, well, you know, when we do a pushback into a vector, this is what happens,
we can watch the elements get moved into the new memory location or whatever.
But I've never considered once using it to actually observe the efficiency of code that I wrote. I've only ever used it to observe the behavior of the standard, basically. That's
really interesting to me. Yeah. And you might want to look in the code. They have some nice ways to
sort of format that information in tables, and ways to adjust that, like I said, with scale. So you're
not looking at, oh, you know, I can't really compare a hundred billion to a thousand. It'll
normalize it based on one of these growth functions, so you can tell how closely it matches
n squared or log n. That's pretty cool. We mentioned
concepts earlier.
How do you think
C++20 concepts feature
matches Alex's original
vision of the STL?
Yeah, this is kind of a tough question.
But I think the best way
to answer it is to
understand a little bit more about what concepts are.
So like I said, concepts are
not something new in C++. They're something that exists outside of C++. They're something that
exists outside of programming even. They are like mathematical and logical structures. So
an example of one of these are things from abstract algebra for people who are into math.
We have groups. We have rings.
Most programmers are probably familiar with vector space.
And what a concept is, is it's something that has, it's a type. You have a type, and types have associated types,
and they have associated values, and they have associated operations.
And so you have, you know, vector space has the scalars and vectors,
and you can multiply scalars by vectors.
You can multiply scalars by scalars.
You can add scalars.
You can add vectors.
And all of these operations, besides having access to them, they have restrictions on how they're supposed to behave.
So, for example, the scalar addition or multiplication needs to be associative.
If it's not associative, it's not a vector space.
And so concepts, as Alex has talked about them, he's always described the concepts in comments that go before the code.
So he'll say sort requires a forward iterator.
Okay.
And, you know, the forward iterator is just like one of these vector space things.
It says you have a value type, you have an advanced function,
and that sort of thing.
And so what the C++20 concepts allow you to do
is essentially define a restriction on a type
where you can list sets of constraints,
and the constraints are little blocks of code
that have to compile with the result that you expect
in order to pass the concept.
So for example, you could say a forward iterator is a concept,
and it needs to have a value type member,
and that value type member needs to be a regular type or something like that.
Or it needs to have an increment operation,
and that increment operation needs to satisfy some other constraint.
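A sketch of what that looks like in C++20 syntax. The names here are hypothetical, not the standard library's own definitions, and a full forward-iterator concept requires quite a bit more; this just shows the shape of a type predicate listing associated types and operations.

    #include <concepts>
    #include <iterator>

    template <typename I>
    concept MyForwardIterator =
        std::regular<I> &&                       // default-constructible, copyable, comparable
        requires(I i) {
            typename std::iter_value_t<I>;       // an associated value type
            { *i } -> std::convertible_to<std::iter_value_t<I>>;  // dereferenceable
            { ++i } -> std::same_as<I&>;         // increment yields the iterator
        };

    // An algorithm can then state its requirement directly (assuming the
    // value type is also equality comparable, which a real concept would say):
    template <MyForwardIterator I>
    I my_find(I first, I last, const std::iter_value_t<I>& x) {
        while (first != last && !(*first == x)) ++first;
        return first;
    }
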
And so the fundamental thing here, I think, is type predicates, things that you can evaluate
on types to ensure that they're true or false. And my understanding of the concepts in C++20
is this is the concepts light and not the more complicated concepts that they were going to do.
And from what I can tell, it does this type predicate,
this associated types well, and it acts as sort of like a better documentation
to be able to write things like, here's a forward iterator,
and now you're going to go implement a forward iterator.
The compiler is going to make sure you actually got all the things in there.
And similarly, you can write an algorithm.
You can say, oh, it takes a forward iterator,
and it will check all those things in ways that it didn't before, whereas before you
had to remember the list. So I think it's certainly a step in the vision that he had.
There's probably some details that he would disagree about. In particular, as I looked
through the library itself of concepts that are in the standard library, there's not a whole lot
of them, and they tend to be focused on details about C++ programming and not more of these abstract
concepts, if that makes any sense. So for example, there's only one concept I could see related to
ordering, which is totally ordered. And in the C++ library, there's actually algorithms that
use all different kinds of ordering relationships.
So perhaps there's some differences there in the actual concepts that made it into the standard library.
If you're curious more about this subject, there is one of the videos in the course, the A9 course.
Bjarne actually comes to their class and walks them through an early version of concepts.
And he walks through, here's how it would look in the code.
And him and Alex kind of have a back and forth about,
yeah, as you can imagine,
they have kind of a back and forth about different features.
I didn't include that in my course notes
just because it's kind of a derailment
from the rest of the course.
And it's about Bjarne and not about Alex.
But if you're curious about that topic,
that would be more where you could learn about it. Very interesting.
So you mentioned in your show notes, Paul McJones, but I don't think that name has come up
yet today, has it? No.
So who is Paul and why should we be discussing him?
Yeah. So Paul is a friend of Alex who worked with him mostly at Adobe,
and he is closely involved in his Elements of Programming project.
And he's a very quiet guy.
He is actually there in the Efficient Programming with Components course.
He's there attending.
And he's also in a lot of other lectures that Alex gives about Elements of Programming, but he'll never engage.
He'll sometimes spout off some knowledge about things, but he's a very quiet guy. But he actually plays a really important role for shaping Alex's
theoretical work and shaping the book Elements of Programming. In particular, you know, whenever
you're doing a big writing project like this, you have to determine a sort of voice or a style of
presenting the information. And what Paul really helped Alex do with Elements of Programming is to structure the text in
the same way that Euclid's Elements is structured, or traditional mathematical work,
where you have definitions and axioms and theorems. And all the sort of discussion text
around that information,
you throw it away; all the information should be there in those. And if you really understand the definitions and the axioms, then you'll understand the
material. And so, like I said, Alex is this big character. He has lots of opinions about things.
And Paul kind of helped him to focus that into the Elements of Programming text in the way that
we have it. And I think that that text is challenging for that reason. And a lot of people may not get a lot
out of it because of that. But I think it's important to frame Alex's work,
to have one representation of Alex's work in that same way, and that's what Paul helped him do.
And so, yeah, I think Paul studied math as well. If you want to see more about him,
there's a lecture that Alex and Paul give at Stanford about the Elements of Programming book.
And Paul participates in that one a little bit more and shares some opinions.
So that's all I have about Paul.
So we've talked about algorithm design and performance and analysis up to this point.
Does Alex talk about correctness of code or safety or other concerns?
Yeah. So one of the questions he brings up is how do we know code is correct or what makes code
correct? And the first answer that comes to mind as engineers is we might say, well, code is correct
if it follows the specification. Now, Alex points out, you almost never have a specification
unless you're implementing a standard or you're emulating something.
You're the one coming up with the specification.
And he also has examples of code that he would say is written incorrectly.
So one example is in C, you have this bsearch function, which is a binary search implementation.
And he says it's incorrect because it doesn't return any
useful information. What it returns is the value that it finds. But the whole reason why you want
binary search is because you want to insert things or delete things. It's a very general operation.
So it returns the wrong thing. So it's incorrect. So he defines correctness in this way of it not
only has to sort of satisfy a specification, whatever that means, but it has
to do something useful. And so whenever we're writing out, we need to think about what's
actually the thing that it's supposed to do, what's its job, and how is the person going to
use this? And so one of the very practical things he talks about is make sure you return all the useful information.
If you did work to do something to find an answer, just return that work as a separate argument.
They might need that work.
You already did the work.
Don't make them do it again.
So a lot of the STL algorithms follow this.
So, for example, the find or the delete, they'll return the iterator of where they ended up.
So you can continue the find.
You can continue the delete. You already did the work, you don't want to do it again.
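The contrast he's drawing, sketched: C's bsearch hands back only a pointer to a matching element (or null), while something like std::lower_bound returns the position, which is exactly the useful information if you want to insert, erase, or keep searching. My example, not his:

    #include <algorithm>
    #include <vector>

    std::vector<int>::iterator insert_sorted(std::vector<int>& v, int x) {
        auto pos = std::lower_bound(v.begin(), v.end(), x);  // the useful information
        return v.insert(pos, x);                             // reused directly
    }
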
And that's an important component of reusability. As far as safety, he makes this big emphasis on
preconditions, and obviously satisfying the constraints of the concepts that an algorithm
requires to work. So, you know, if an algorithm says this is a forward
iterator, it doesn't need to work on things that are not forward iterators. That's not safety.
Safety is not, you know, if you plug the wrong thing in, you might not get the right answer.
Safety is when you satisfy the preconditions, then you're guaranteed to get the post conditions.
And this is represented in a practical way in the way he writes algorithms. He'll often
write a version of the algorithm that requires absurd preconditions. He does this quicksort,
and it says there needs to be an element at the start of the list which is smaller than all the
elements before you can run this. But then when you run it, you know, it'll sort the rest of
the list. And then he'll write a version that ensures that precondition is met and then calls that algorithm. And this is a style of programming that I haven't seen in
very many places, but I think it's very powerful, to let you think about the assumptions that
you're making and to really boil down an algorithm to its fundamental parts. Because I think a lot of
what we do, especially sort of error checking at the beginning, or saying, is the list empty,
do we have all the things that we need?
This is not essential to the actual part of the algorithm.
It's just making sure some preconditions are satisfied.
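A sketch of the style, adapted from the classic SGI STL insertion-sort helpers rather than his exact lecture code (the same sentinel trick he describes inside quicksort). The unguarded version states an absurd-looking precondition and omits the bounds check; the guarded version establishes the precondition and delegates.

    #include <algorithm>

    // Precondition: some element before `last` compares <= value, so the
    // backward scan needs no bounds check (the precondition is the sentinel).
    template <typename It, typename T>
    void unguarded_linear_insert(It last, T value) {
        It prev = last;
        --prev;
        while (value < *prev) {
            *last = *prev;
            last = prev;
            --prev;
        }
        *last = value;
    }

    // Guarded version: inserts *last into the sorted range [first, last),
    // establishing the precondition first (random-access iterators assumed).
    template <typename It>
    void linear_insert(It first, It last) {
        auto value = *last;
        if (value < *first) {                         // no sentinel exists: handle it here
            std::copy_backward(first, last, last + 1);
            *first = value;
        } else {
            unguarded_linear_insert(last, value);     // precondition now holds
        }
    }
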
And an advantage to that is that then, when you get to, you know, the sort of memory safety question, it's like, hey, look, computers have pointers.
Computers have memory.
Computers have resources.
And pretending that those things don't exist, it's often useful for eliminating classes of bugs.
But there is a lot, you know, ultimately you have to satisfy
preconditions to get things to work. You can write bad
code in any language. You can write
wrong code in any language, and so
we have to satisfy preconditions
to get our code to work correctly.
Okay. We've, you know,
talked about several things.
Maybe to wrap up, is there
any, like, particular lessons
you got from studying Alex's work that we haven't talked about already?
I would say that one important lesson besides this whole concept of generic programming, besides this correctness of code, this sort of thing,
I think that the importance of performance is a lesson that I haven't seen taught in the same way. If you think about the
role of writing a standard library component, as opposed to something that you use every day at
your work, you know, or it's just sort of day to day code, the standard library component is going
to be used forever. We've been stuck with these things for 25 years. And you never know what kind
of thing people are going to plug into them. For my master's
thesis, I did some work on something called braid groups. And I had a less than comparison operation,
which takes about five seconds on my data set to evaluate this less-than comparison. And so
you being able to use an algorithm, which counts every single less than operation and ensures that it's minimal is actually really valuable for the kind of work that I was doing.
And I think that that importance of performance and fundamental code, I get that not every piece of code is written that way.
Maybe in your project, you're using 80 percent STL, 20 percent your own internal library.
You know, you really want to make that 20% internal library work as good as the STL.
And so I think performance is a very important part of getting code to be reusable
because anytime you have something that's not optimal,
there's going to be some case that comes across that needs it to be optimal or needs a faster version.
And you almost get this peace of mind when you know the STL version
is about as good as anything that I could write.
And you know it's going to do exactly what you think,
and so you can feel confident
in plugging the operation that you want to in there.
And there was a lot of confusion regarding the STL
that was in the air around the time that it came out. You've probably heard stories about
how people in video games said
we're not going to use STL.
A lot of this just comes down to they just didn't
understand it. Some of it's maybe compiler support
for the platforms that they were looking at.
Even if you're not going to use vector
or map or any of these things, it's like the
find operation, you can't write a better find.
It's just not going to happen. I think a lot
of this is people not being aware of what's in there and how it's designed.
And performance is one of these things that makes it so reusable and able to, you know, withstand the test of time with all these different data sets that we have.
Well, Justin, it was great having you on the show today.
Thank you so much for telling us all about this course and your thoughts on Alex
Stepanov's work. Like I said, we'll put the link to YouTube and to
your course notes into our show notes. Thanks for coming on the show. Yeah, thank you. And I want to
comment that if anyone has corrections, they're certainly welcome or additional footnotes,
that sort of thing. So I appreciate you guys taking the time to talk with me about it.
Awesome. Thanks a lot.
Thanks, Justin.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in
or if you have a suggestion for a topic.
We'd love to hear about that too.
You can email all your thoughts to feedback at cppcast.com.
We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter.
You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter.
We'd also like to thank all our patrons who help support the show through Patreon.
If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast.
And of course, you can find all that info and the show notes on the podcast website at cppcast.com.
Theme music
for this episode was provided by