CppCast - Large Scale C++
Episode Date: February 6, 2020

Rob and Jason are joined by author John Lakos. They first talk about a funny C++ themed freestyle rap video commissioned by Victor Zverovich and a C++20 reference card produced by Bartlomiej Filipek. Then John discusses his new book, Large-Scale C++ Volume I: Process and Architecture. In addition to discussing the book, John shares some of his thoughts on allocators, modules, move semantics and contracts.

News
Jason's C++ Training Courses in Stuttgart
Freestyle C++ Rap for the next meeting
C++20 Reference card

Links
Large-Scale C++ Volume I: Process and Architecture
CppCon 2019: John Lakos "Value Proposition: Allocator-Aware (AA) Software"
C++Now 2019: John Lakos "Value Proposition: Allocator-Aware (AA) Software"
CppCon 2018: John Lakos "C++ Modules and Large-Scale Development"
Local (Arena) Memory Allocators Part 1 - John Lakos - Meeting C++ 2017
Local (Arena) Allocators Part II - John Lakos - Meeting C++ 2017
CppCon 2019: Alisdair Meredith, Pablo Halpern "Getting Allocators out of Our Way"
CppCon 2019: Joshua Berne "Contract use: Past, Present, and Future"
CppCon 2019: Rostislav Khlebnikov "Avoid Misuse of Contracts!"

Sponsors
Backtrace
Software Crash Management for Embedded Devices and IOT Projects
Announcing Visual Studio Extension - Integrated Crash Reporting in 5 Minutes
Transcript
Thank you. Check out their new Visual Studio extension for C++ and claim your free trial at backtrace.io.
In this episode, we talk about freestyle rapping and C++ reference cards.
Then we talk to John Lakos.
John talks to us about his new book. This is CppCast, the first podcast for C++ developers by C++ developers. I'm your host,
Rob Irving, joined by my co-host, Jason Turner. Jason, how's it going today?
I'm all right, Rob. How are you doing?
Doing fine. You had a little announcement to make, right?
Yeah, so I've got two classes that are coming up in Stuttgart at the end of September and
running into the first day of October, on writing correct, performant code and on taking advantage of compile-time constexpr and template topics. So I'd just suggest that our listeners go and check those out. We'll link to those in the notes. Anyone can sign up for them.
Yeah, we'll definitely put those in the show notes. And are you going to Stuttgart just for these courses,
or did you already have plans to be there for a conference or something?
No, that's just for these courses.
Oh, wow. That's great.
Okay, well, at the top of our episode, I'd like to read a piece of feedback.
We got this comment on the website,
and unfortunately we got a couple of similar comments on Reddit.
So this one was from Matt Dawson, saying: "Very excited about this topic, but audio quality for the guest was very poor and I eventually gave up. I guess this is the exception which proves the rule, as the audio quality on CppCast is usually great."
So yeah, I really want to apologize for the poor audio quality, both to our listeners and to the guest, because listeners weren't able to really hear what Vadim had to say. We were just talking before the show about how maybe we'll invite him on to kind of do a redo.
We've never done a redo before.
We have not, but it would be nice to have Vadim on and make sure he has some better audio equipment beforehand. And yeah, we'll make sure that we don't let that sort of thing happen again. We both thought at the beginning of the interview that, yeah, this isn't great, but we've scrubbed somewhat poor audio before. It just was a lot worse than we thought it was.
Yeah. And it seemed like the quality was coming and going a little bit. When he was doing the majority of the talking, the audio seemed better from my perspective, so I thought it would be something that we could clean up and the listener would never really notice. But yeah, that's not how it worked out.
Yeah, unfortunately not. So we will definitely keep this in mind and try to be better. Apologies again to all the listeners who weren't able to stick it through the whole episode because of the quality.
Well, we'd love to hear your thoughts about the show. You can always reach out to us on Facebook or Twitter, or email us at feedback@cppcast.com. And don't forget to leave us a review on iTunes or subscribe on YouTube.
Joining us today is John Lakos. John, author of Large-Scale C++ Software Design, serves at Bloomberg LP in New York City as a senior architect and mentor for C++ software development worldwide. He's also an active voting member of the C++ Standards Committee's Evolution
Working Group. Previously, Dr. Lakos directed the design and development of infrastructure libraries
for proprietary analytic financial applications at Bear Stearns. For 12 years prior, Dr. Lakos
developed large frameworks and advanced ICCAD applications at Mentor Graphics,
where he holds multiple software patents. His academic credentials include a PhD in computer
science and an ScD in electrical engineering from Columbia University. Dr. Lakos received his
undergraduate degree from MIT in mathematics and computer science. His new book, the first volume
of which is entitled Large-Scale C++ Volume 1 Process and Architecture, is now available from Pearson Education.
John, welcome to the show.
Well, thank you very much for having me.
It's a pleasure to be here.
I have not, I don't believe, ever heard anything about your history with Mentor Graphics in the past.
I'm kind of curious what you worked on there, if you can mention more about it.
Be happy to.
It's been, geez, it's been 23 years since I left Mentor Graphics.
And I published my first book while I was at Mentor Graphics based on the knowledge that I gained from working there.
I was there 11 years.
And we worked on very large scale at the time, software systems.
And what we worked on is CAD tools.
We had routers.
We had visual editors. And my particular contribution was a, I'm going to say it's a device and wire extraction and recognition system so that you could take layout that was just handmade layout, extract all the transistors and the contacts and whatever.
So you get all the devices out.
Then the second patent is to extract the wires using all-angle geometry,
and I took advantage of vector calculus to make that super quick.
And in fact, being able to extract a 200-segment wire is something that seems like a really hard problem to people.
And what's fantastic is that the early part of this book that I just wrote captures the
power of what's called dynamic programming or memoization, so that even though it seems
like an exponentially hard problem, it's really a very simple quadratic problem if you get
it right.
And so, again, using that wonderful technique, I was able to do very, very large problems in what seemed like an impossibly short time.
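As a rough illustration of the memoization idea mentioned here, the sketch below uses a hypothetical toy problem, counting right/down lattice paths on a grid. It is not the wire-extraction algorithm John describes, and the class name PathCounter is an assumption made for this example. Plain recursion on this problem revisits the same cells exponentially often, while caching each cell's answer keeps the work proportional to the number of cells.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy problem: count the monotone lattice paths from (0, 0) to
// (r, c), moving only right or down. Each cell's answer is computed once and
// cached, so the total work is proportional to rows * cols.
class PathCounter {
  public:
    PathCounter(int rows, int cols)
    : d_cols(cols)
    , d_cache(static_cast<std::size_t>(rows + 1) * (cols + 1), 0)
    {
    }

    // Requires 0 <= r <= rows and 0 <= c <= cols as given to the constructor.
    std::uint64_t count(int r, int c)
    {
        if (r == 0 || c == 0) {
            return 1;  // a single straight path along an edge
        }
        std::uint64_t& slot =
            d_cache[static_cast<std::size_t>(r) * (d_cols + 1) + c];
        if (slot == 0) {  // zero means "not computed yet"
            slot = count(r - 1, c) + count(r, c - 1);
        }
        return slot;
    }

  private:
    int                        d_cols;
    std::vector<std::uint64_t> d_cache;  // one slot per grid cell
};
```

Without the cache, a 16-by-16 query would make hundreds of millions of recursive calls; with it, each of the 289 cells is filled at most once.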
And it was quite successful.
We could extract entire semi-standard libraries.
We could extract them, retarget them to a different technology, and then put them out.
What would take a year could be done in a few days.
It was pretty spectacular.
So interestingly, it sounds like, if I might read a little bit between the lines, that the work that you did on graphics early on, even though it seems completely unrelated to the financial industry
today, is still helping you. What you learned there is helping you today still.
Well, when we say Mentor Graphics, graphics is... I mean, it's really CAD, right? It's really computer-aided
design. And yes, there's some graphical rendering and whatever, but it's more
hard geometry and math and compaction and algorithms. It's not
quite graphics. I just want to be clear: it's not the front-end kind of graphics, it's the back-end kind of graphics. And yes, the back-end kind of graphics is
rife with very interesting, very hard problems
that require heuristic solutions.
Well, John, we've got a couple news articles to discuss. Feel free to comment on any of these,
and then we'll start talking more about your book, and maybe a little bit about the Prague ISO
meeting next week.
Sure thing.
Okay, so this first one is a freestyle C++ rap. Victor Zverovich, who we have had on the show before to talk about the {fmt} library, I guess reached out to this British comedian and YouTuber and was able to get him to do a freestyle rap about C++, which is pretty impressive because this guy, as he says in the video, knows absolutely nothing about programming, definitely knows nothing about C++, but he did a pretty good job making a relevant on-the-fly rap about it.
Yeah, it's pretty funny. Yeah, I don't have much to say.
Just go watch the video. It's funny.
I did have a chance to take a look at it
and I was impressed. Anything that's impromptu
like that, that he's not a
subject domain expert in
and getting that much nuance
out of just a little bit of Googling,
my hat's off to him. He's talented.
And just for the record, Chris Turner,
as far as I know, no relation to me.
I wonder if they're going to play this at the start of the Prague meeting.
Maybe. Oh, that'll be entertaining.
Okay, and then the next thing we have is from Bartek's blog, and this is a C++20 reference card. This is pretty neat. If you sign up for his mailing list, you can download it, and it has a pretty good, concise reference for all the new language and library features being added in C++20 that can fit on a page. Pretty cool that he was able to put this together, and you can get it for free.
Yeah, these kinds of things are pretty neat. And I also had an opportunity to look at that, and I thought it was a very handy way of at least knowing what your options are. And then if you need to know something more, you can always explore deeper. But just having the index is fantastic.
Yeah, and he's put one out for C++17 as well. So if you need a refresher on what's been added over the past four years, you can look at both.
He's constantly producing little downloadable things like this. And yeah, you have to sign up for his mailing list, but that's not that big of a deal, really.
Yeah, free mailing list.
Okay, so John, as we mentioned in your bio, you just released Large-Scale C++ Volume I: Process and Architecture, and you mentioned that you had your original book, Large-Scale C++ Software Design. Is the new book kind of an update to the first one, or is it all new material?
So that's a good question, and the answer is a little bit of both, but not really. What I mean is, it has to be, because there's nothing in the old book that I would take back, particularly.
I mean, I did.
I will.
All right.
I take it back.
There's one place back in the day where I had a diagram and I had a string and a word and I used derivation to show as a sort of an off-the-cuff dependency example, and I wrongly, and I'll never live it down,
wrongly derived word from string, saying that a word is a string that has less, more restrictions
on itself. And of course, because of slicing and all the other things that we all know are just
terrible, I got just, I was never forgiven. There was a gentleman, I'll just say his first name,
Dat, who made it very clear to me many times forever that I just didn't know what I was
talking about. And I quickly conceded, yep, yep, yep. And oh my goodness, it doesn't matter.
There's no excuse for putting something in that's wrong to show something that's right.
So I take full responsibility for that.
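For readers who have not run into the problem John is describing, here is a minimal sketch of why publicly deriving a more restricted Word from String goes wrong. This is a hypothetical reconstruction, not the code or diagram from the 1996 book; the class members and the helper functions sliceCopy and addSuffix are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical base class: any sequence of characters is a valid String.
class String {
  public:
    explicit String(std::string text) : d_text(std::move(text)) {}
    const std::string& text() const { return d_text; }
    void append(const std::string& s) { d_text += s; }

  private:
    std::string d_text;
};

// Hypothetical derived class with an extra restriction: no spaces allowed.
class Word : public String {
  public:
    explicit Word(std::string text) : String(std::move(text)) {}
    bool isValidWord() const
    {
        return text().find(' ') == std::string::npos;
    }
};

// Slicing: the returned object is a plain String; everything that made the
// argument a Word (its type and its restriction) is silently dropped.
inline String sliceCopy(const Word& w) { return w; }

// Written against String, so it has no idea that a Word must not contain
// spaces. Passing a Word here quietly breaks the Word's invariant.
inline void addSuffix(String& s) { s.append(" and more"); }
```

Because every Word can be passed wherever a String is expected, the base-class interface can always be used to violate the derived class's extra restrictions, which is the core of the objection.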
And now that's off my chest and everything else I stand by.
So this new book, oh, and there's one other story I can tell you.
When I wrote that book, I have to give credit to John Wait at Addison-Wesley at the time,
because the reason I even wrote that other book is I was working at Columbia and someone called me or I got a flyer in the mail saying, do I want to get a desk copy of this other book that was a good book?
I don't remember which one it was, but it was one good book of the day.
And I said, absolutely.
And so I let them know.
And they said, well, how do we know that you work at Columbia?
And I was like, wait, wait, you sent me the desk references.
And then I got a little, you know, as my friend Sean would say, howly.
And I said, do you realize I'm even thinking of writing a book?
And it would be on large scale C++ and blah, blah, blah.
The next thing I know, John Wait flies out to see me, and the next thing I know, I'm writing a book.
So that's actually the 1996 book, which I started in '93. They wanted me to write a book called Large-Scale C++ Software Development. And I didn't know about development. I was a kid of 34. How would I know anything about development? But I did know about design, or at least I thought I did, except for that one thing that I got wrong. Never happened again, not in a million years.
So I started writing this, and it did take me about 14 months to come up with the first draft. But then there were some appendices that I felt were absolutely essential to get right, and that took maybe another year. And then finally doing all the figures, there were like 400 figures in that book, and so that wound up taking another year. So by the time I finally published, it was July of 1996. So anyway, that's Lakos96, and I'm good with that. But that book was not a complete treatment of C++ software design. It was really an architecture book.
This book that you have here is part of a three-volume series. It's the first part. I wrote
the third volume first, which is Verification and Testing, because 20 years ago, 21 years ago,
when I started writing this book, I knew a whole lot about testing that I didn't get to put in the first book.
So I wrote the third volume first, chapters 7 through 10.
I literally wrote them.
I have them.
I published chapter 7 at C++ Connections in 2004.
And I've been thinking about publishing this like every year since.
But it just never – anyway, that happened.
Finally, in about 2011, someone joined Bloomberg.
His name is Jeffrey Olkin, and he is the – oh, I'm jumping ahead.
Prior to this, I needed somebody to help me do structural editing on the book, because the book was looking like it was on the order of 2,500 pages, and I needed somebody else to bounce ideas off of. Pearson was very generous and hired a structural editor, and the structural editor read the manuscript and said, you know, there's a lot of stuff here. Unfortunately, even though he had a PhD in computer science and this was his job, he just said, I don't know, I can't help you.
So anyway, fast forward to 2011, when this fellow Jeffrey Olkin joined Bloomberg. He's very important to me because back then I was very much into helping people write contracts. We might talk about that later. And I saw a paper written by somebody
who was a manager of another group. And I was reading it and it was okay, but it wasn't quite
stellar. Then all of a sudden around, I don't know, maybe it was page 13 or something,
it became perfect English, just perfect. And I went over and I said, what happened here?
Oh yeah, I started this and I gave it to Jeffrey Olkin. I said, where's Jeffrey Olkin?
So I found Jeffrey Olkin.
Shortly after that, he became the structural editor of my book.
And he read every word of this book and helped me rewrite it.
So I owe him a ton as being sort of my ghost co-author. He didn't actually come up with the words and ideas per se, but he helped me structure it in a way that was coherent. I mean, that was really huge. So many thanks to Jeffrey Olkin.
So what's this book? This book right here, this volume that I'm touching, is the physical design part of the three-volume set. So the first volume is Process and Architecture, and it talks about the same kinds of things
that the original book talked about.
But the original book had three parts to it as well.
It had sort of a preparation,
then it had a notion of what physical design is,
and then it had a little bit more on logical design.
And the first book is known
for the physical design contribution.
And this book is all, all, I tell you,
all physical design. And then what's really nice about this is the final chapter, which is rather intimidating. It's about half the book, and it's about how to apply physical design in large-scale software development. So that's what this book is, in a nutshell-ish.
Okay, so you've mentioned the first book, and you told us that you already have the third book written, but as far as I know you didn't say anything at all about what the second volume is going to be.
Well, the second volume is called Design and Implementation, so it's got a fairly wide category. Chapter four
really talks about some stuff that that you just have to know it and no one
really talks about it. So the value of a value is the first section of chapter four. Chapter four
is really about interfaces and such. And the next section is classifying classes because I felt like
being cute. The third one is value semantics and the fourth one is vocabulary types. And those titles have been there since the
1990s. They're just known and I use them routinely. And I thought I understood value semantics pretty
well back in 2001. Then right around 2008, an interesting story, I was in the process of
getting something that's also near and dear to my heart, allocators properly into the standard.
I know that's a touchy subject, and I do want to touch on it later.
But it was pushed back, and I had to wait for the next meeting.
I was in one of those big rooms with no windows in Hawaii, and there were about 25 men in T-shirts with laptops.
And they were talking about a defect in the wording of the standard, where it said that these two pattern objects were the same.
And people were going, well, what do you mean, the same?
Well, you know, the same. Well, maybe they compare equal. What would happen if you copy construct? There was no wording. There was no way to say what they meant by the same. Literally, someone finally said things like, well, they're equal, they're equivalent, you know what I mean. But no one knew what they meant. Did it mean the same object? Did it mean that there was something about the two objects that was the same?
What was it?
And it turned out that what the two objects had that was the same is the same value.
Well, what do we mean by value?
And now we need to understand what salient attributes are.
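To make the distinction between "the same object" and "the same value" concrete, here is a small sketch. It assumes the usual reading of salient attributes as the attributes that contribute to an object's value and therefore take part in equality comparison. The Point class and the helper names are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <vector>

// A hypothetical value-semantic type. Its salient attributes are x and y:
// they alone define the value, so they alone take part in operator==.
class Point {
  public:
    Point(int x, int y) : d_x(x), d_y(y) {}
    int x() const { return d_x; }
    int y() const { return d_y; }

  private:
    int d_x;
    int d_y;
};

inline bool operator==(const Point& lhs, const Point& rhs)
{
    return lhs.x() == rhs.x() && lhs.y() == rhs.y();
}

inline bool operator!=(const Point& lhs, const Point& rhs)
{
    return !(lhs == rhs);
}

// A copy is a different object that has the same value as the original.
template <class T>
bool copyHasSameValue(const T& original)
{
    T copy(original);
    return &copy != &original && copy == original;
}

// Capacity is an attribute of std::vector, but not a salient one: two
// vectors with different capacities still compare equal when their
// elements match.
inline bool capacityIsNotSalient()
{
    std::vector<int> a{1, 2, 3};
    std::vector<int> b{1, 2, 3};
    b.reserve(100);
    return a.capacity() != b.capacity() && a == b;
}
```

Under this reading, "the same" in the wording being debated would mean that two distinct objects agree on every salient attribute, not that they are one object.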
And so while I was there, I vowed, because I was just really upset that my proposal didn't go through, and I didn't say anything in the meeting, because I said, these guys, come on, please.
I said, I'm going to write a presentation, which I then presented at ACCU shortly thereafter.
But it got me started.
And it was right around the time that Alex Stepanov wrote his book, Elements of Programming.
This is a true story.
I got the book, Elements of Programming, and I started to read it, and I couldn't understand it.
I put it in my briefcase, and I left it there.
For years, I carried it around, and I had no idea what it said, nothing.
I just couldn't.
I carried it with me.
Then finally, I was asked to interview Alex Stepanov on his new book, From Mathematics to Generic Programming.
And I said, damn it.
All right.
I got to go back and read the book.
So I couldn't read the book.
So I photocopied the pages and made the print really big.
And I got past page two and discovered that chapter one of his book is all about value semantics.
And I'm like, oh my God, if he'd just said so. Anyway, what's really
wonderful is that the first chapter of Alex's book and the entire thesis that I have on value
semantics are within, you know, a micron of the same thing. And where they differ is sort of nebulous
and unimportant. It's because he comes from a more mathematical
background and I come from a more software engineering background, but we backed into
pretty much the same thing, exactly the same thing. And it's very heartening
to know that if you come at something from two different ways and you get the same answer,
you're probably right. And that's the beauty of testing. If you
can test something in a way that's completely different than the way you designed it,
then you have a lot of confidence that it's right. So there's a lot of goodness in that.
So anyway, where was I? I think we're talking about what was in the second book.
Okay. So in the second book, the notion of value semantics is covered.
And towards the end of it, I mean, there's a lot about interfaces.
And towards the end of it, which is the end of Chapter 4, there is a dedicated section on memory allocation.
And that's because memory allocation needs to be part of the interface of objects.
And this is a contentious issue. And a lot of people can fairly say
with a library-based solution for memory allocation,
it's ugly, it's expensive to write,
it's expensive to maintain.
There's a lot of clutter in the interface.
And if you're not one of the power users
of memory allocation, you're not a games person,
you're not at Wall Street or in a hedge fund
or you're trying to do low latency work
or you're not doing something in an embedded system, if you're not one of those guys,
but you just want to use C++ because it's a cool language, it's a pain. It was not usable in C++03,
because there were weasel words that said, basically, it doesn't have to work. In C++11, it still invaded the type system, which is why vocabulary
types and templatized policies don't work well together. And so that was that. In C++17, we got
the polymorphic memory resource, which my good buddy Pablo, who used to work for me and then
went to Intel and now works for me, is in charge of. But it's not good enough,
because as much good as memory allocation brings to performance and as much good ancillary
good that it does for being able to instrument at scale anything in production and being
able to place out whole objects in special memory and rapid prototyping and reduced risk
and even garbage collection in an arena all of those
things that you can do that's not quite good enough why because the language solution doesn't
allow interoperability with modern c++ aggregate initialization or compiler generated constructors
and assignment and so in order to get what you want A, have to write it all yourself and get it right.
And B, you can't use it in fun ways like lambdas and so on.
And so it's kind of eh.
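For listeners who have not used the C++17 facility being discussed, here is a minimal sketch of the library-based approach using the standard std::pmr types. The function name elementsShareArena and the buffer size are assumptions made for this example. It shows a vector and its string elements drawing memory from one local arena, which is the kind of control the library solution already offers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <string>
#include <vector>

// Builds a vector of strings inside a local arena and reports whether both
// the vector and its elements ended up using that arena.
inline bool elementsShareArena()
{
    std::array<std::byte, 4096> buffer;  // stack storage for the arena

    // Hands out memory from 'buffer' first; falls back to the default
    // resource only if the buffer is exhausted.
    std::pmr::monotonic_buffer_resource arena(buffer.data(), buffer.size());

    // The allocator is passed explicitly when the container is created.
    std::pmr::vector<std::pmr::string> names(&arena);

    // The vector forwards its allocator to each element it constructs, so
    // the strings allocate from the same arena.
    names.emplace_back("a string long enough to need a real allocation");
    names.emplace_back("another string that is also too long to be inlined");

    return names.get_allocator().resource() == &arena
        && names[0].get_allocator().resource() == &arena
        && names[1].get_allocator().resource() == &arena;
}
```

The cost John refers to shows up when you write your own allocator-aware class: it needs an allocator_type and allocator-taking overloads of its constructors, all written and maintained by hand.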
But if you're not interested in the cool parts of C++ and all you want to do is get your job done, it's indispensable. Now imagine that the language itself made your classes allocator-aware and the allocators didn't show up
in the constructors at all. So all you have to do is just say, I want to create this vector using
this allocator. Done. And you never even knew that you wrote a class that was allocator-aware,
because there are no allocators in it. It's kind of the same as making a function virtual. If you make it virtual, do you see all
those pointers to functions that the compiler put in? No, you just know that it's polymorphic.
God bless. So if we at Bloomberg, by the way, on this particular podcast, I'm speaking for me,
but when I'm at the standards committee, I speak for Bloomberg. I'm the voting
member. But I want to tell you that, for real, if I'm successful and this actually comes to pass,
all of the allocators that we've been using at Bloomberg for 20 years, the library-based
allocators that we could not live without would come out. We'd take them out and use the language.
And that's a strong statement.
Without allocators in the language, we will continue to maintain them ourselves out of necessity.
They're that important.
Wow.
Okay.
I want to interrupt the discussion for just a moment to bring you a word from our sponsors.
Backtrace is the only cross-platform crash and exception reporting solution that automates all the manual work needed to capture, symbolicate, dedupe, classify,
prioritize, and investigate crashes in one interface. Backtrace customers reduce engineering
team time spent on figuring out what crashed, why, and whether it even matters by half or more.
At the time of error, Backtrace jumps into action, capturing detailed dumps of app environmental
state. It then analyzes process memory and executable code to classify errors and highlight important signals such as heap corruption, malware, and much more.
Whether you work on Linux, Windows, mobile, or gaming platforms, Backtrace can take pain out of crash handling.
Check out their new Visual Studio extension for C++ developers.
Companies like Fastly, Amazon, and Comcast use Backtrace to improve software stability.
It's free to try, minutes to set up, with no commitment necessary. Check them out at backtrace.io slash cppcast.
So you mentioned the three different volumes. The first volume is already out,
and your original book came out, you know, 20-some years ago. I'm assuming we're not going to wait
20 years for the second and third volume. It sounds like they're mostly written already.
So now I get to talk about future books real quick. So there's the volume two and volume
three. People joke that I'll be 100 years old by the time volume three comes out. Well, they don't
know Laurie. Laurie Hughes is someone that I had the pleasure of working with in producing volume
one. I knew she was talented because she was able to render three-dimensional things.
She's a technical writer.
She's also project manager.
And she got me to get this book out this year.
I think there's no other human being that could have done it.
And she was merciless.
And I'm not easy to work with,
especially when I get no sleep for five months.
So it was challenging.
But she and I together managed to get me to get it done.
She has already put into Pearson for a schedule for the second and third books.
I mean, it's not going to be done in a few months because the stuff is old and needs
to be updated.
But the sections are already forward referenced from the first volume.
That's how... yes, seriously, that's crazy. That doesn't normally happen. We're forward referencing books that don't in theory exist, but they do. So two more volumes are coming. If I had to give
it an honest guess before Laurie, I would have said in the next 10 years. With Laurie, I'd say
in the next five. Now, if I beat that, that's great, but let's be real.
In the meantime, I have three co-authors on three separate books.
Vittorio Romeo, who works at Bloomberg, and I are putting out something based on a paper that we
wrote that was seen and blessed by many senior people in the standards
committee. We call the paper Embracing Modern C++ Safely, and it's called the EMC-squared paper. Basically, what it did is it took
C++11, only after many, many years, like seven years, and said, here are the C++11 language features in 10 or
11 pages.
Here's what they are, and here's what to be careful about.
And they're sorted into three different categories.
And this is just to annoy people.
Safe features, conditionally safe features, and unsafe features.
And there are only two unsafe features, and it's tongue-in-cheek.
There's nothing unsafe about C++11.
What's unsafe is if you have 6,000 people at a company like Bloomberg and you start doing
member references and you don't know why. Okay? Or you start making things final and you don't have
a company-wide policy that says, we have control over all software, we can always unfinalize it
whenever we need to.
If you just ask us, it takes us two seconds.
Well, a place like Google can do that.
But because Bloomberg actually open-sources and makes things public,
we don't have control over everybody that uses us.
And so we have to make more open decisions, like the standard.
And so for us to put final on something inhibits one of
the critical theses of the book and of the new book, which is hierarchical reuse. Hierarchical
reuse is all about reuse where you're not just saying reuse this logger or reuse this
matching engine or something. Whenever you build something like that,
you build it out of smaller pieces.
And the smaller pieces themselves
are useful for building similar things.
Those smaller pieces in turn
are built out of smaller pieces.
And even a vector,
when you look under the covers,
anybody who's building the standard library
has a whole library underneath
of hierarchically reusable pieces
that they use for all their containers,
all the way down to metaprogramming.
And so what I'm advocating is make fine-grain modular pieces that are stable
and resort to hiding only when the interface between what you're exposing
and what you're hiding is unstable.
So having a private class or a nested class, something like that, in a component,
which is just a .h/.cpp pair, is something that you do because of instability and not because of sloppiness, not because you don't want to test it, none of that.
Also, you don't put two classes in the same component unless you have a reason.
Again, a component is like a module.
It's like a .h/.cpp pair.
And in fact, modules are better components.
But if you don't have modules, you have components.
And the way you design a module and the way you design a component are the same.
And this volume heavily footnotes how, when we have modules widely in use, you would
do exactly what this book says using modules.
But the design is no different.
There's no change.
So if you're interested in modules, buy this book. And just to be perfectly clear,
we are talking about C++ 20 feature modules.
Yes, yes.
And the C++ 20 feature modules was based on what I call a component
that's been around for 25 years.
It's the same thing in many ways.
The differences are important. For example,
in Lakos96 I said no cyclic physical dependencies. In modules, as a
result of my 2017 standards paper, there are no cyclic dependencies possible,
except if you do some... I don't even want to tell you how to
do it. I think it's either not possible or it's so arcane that you have to be at the
fifth-degree black belt level to do it.
That's good enough. If you really want to shoot yourself in the foot, you might be...
It's not the foot. This would be in the head. The other one is long-distance
friendship. The two most important design imperatives that I can give anybody are no physical cycles and no long-distance friendship. Modules do not permit long-distance friendship.
What that means is you cannot have something in one module say that something in another module
is a friend. And that means that you don't have this incredible coupling that would make Parnas
turn over, you know, because Parnas really liked modular programming.
Dijkstra really liked no dependencies. And so those two people can rest in peace.
I think you just challenged probably at least a dozen of our listeners, though,
to figure out how to create cyclic dependencies in modules.
Tell me and I'll try to stamp it out in the next release. It is something that is horrible. You
should not do it if you do figure it out. Another thing
that modules bring to the table that's really good is that they give you control over what we call
transitive includes. Here's an example of where modules are just awesome. You can have
a type that's used in the implementation of a type that's public. Let's say it's a data member
of the type. An example would be, I have a box. The box has two points as data members. Okay. I don't want to publish point because I've
got a better point on the way. So this point is just temporary, right? But I don't want people
to use my point, but I want them to use my box. In old fashioned header files, I'd have to make the
point type available and they could create instances on their own. With modules, we can
stop that from happening. That means that if you need it, you have to include it yourself. That's a rule. That's
a great rule because if you don't do that, somebody can change their implementation and client code
can stop working for no apparent reason. So the reason we want to do this is we want to say,
if you want to use point, you have to import it yourself, and if we don't want you to import it, you won't be importing it. Done. And that's a great thing that
gives you the encapsulation you need, the control over encapsulation. What modules don't give you,
because the modules are obviously the best thing since sliced bread, what modules do not give you
is what's called insulation. In the first book, we talk about insulation: I can insulate
an implementation detail, and when I change it in my library, clients of the library
don't even have to recompile. Whereas if it's encapsulated, they might have to recompile,
but they don't have to rework their code. So modules do wonderful things for encapsulation.
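The difference between encapsulation and insulation described here can be sketched with the familiar opaque-pointer ("pimpl") idiom. This is only an illustration, not code from the book: the `Widget` class, its members, and the header/source split marked in the comments are hypothetical, and both halves are shown in one listing for brevity.

```cpp
#include <memory>
#include <string>

// ---- widget.h: everything a client compiles against ----
class Widget {
  public:
    Widget();
    ~Widget();  // defined in the .cpp, where Impl is a complete type
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;

    void setName(const std::string& name);
    std::string name() const;

  private:
    struct Impl;                   // declared only; clients never see its layout
    std::unique_ptr<Impl> d_impl;  // the only data member clients depend on
};

// ---- widget.cpp: can change without clients recompiling ----
struct Widget::Impl {
    std::string d_name;
    int d_accessCount = 0;  // adding or removing members here leaves sizeof(Widget) alone
};

Widget::Widget() : d_impl(std::make_unique<Impl>()) {}
Widget::~Widget() = default;

void Widget::setName(const std::string& name) { d_impl->d_name = name; }

std::string Widget::name() const
{
    ++d_impl->d_accessCount;  // bookkeeping hidden entirely behind the pointer
    return d_impl->d_name;
}
```

If `d_name` were instead a private data member of `Widget` itself, it would be encapsulated (clients could not touch it) but not insulated: changing it would alter the class layout and force every client to recompile.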
They do nothing at all, period, for insulation. And if you want that to
happen, it isn't going to happen. Now, we could take modules a step further and we could make them
fine-grained views. What that means is, suppose you write a stack class or something, and I'm
giving this only as a toy example, and, you know, push is working great, but only the real experts
are allowed to use pop
because we've been working on that for a while and we're not sure. You could provide a client with
a view that treats the stack class as the same type but is enabled for only push and not pop.
You could then give another client full access. You could give yet another client access for pop
but not push. Then you could have a client using
all three, and they would inherit and take the union of the capabilities of each of the people
they import, or the units they import. And so at the very end of the day, there'd be a lot of different
people accessing the same type. There's no ODR issue, there's nothing like that. And one more
thing we'll mention, because we're in the early stages of this: you can imagine that it's not the same thing as hiding the functionality,
because someday you might expose it. So it's treated much more like an inheritance hierarchy
and overloading. If you have a deep inheritance hierarchy, which we don't recommend, and you have
a private type at the bottom that's a better match than the public type at the top, you'll match the private type and then be told no. And the same kind of thinking would apply to a
view on a subsystem. You'd be told, yeah, you'd love to use this, wouldn't you? Yeah, call the
company and see if you can get it because right now we're not going to let you screw that up.
We're not going to make it so that when we give you a different view, your code changes behavior
because somebody could reasonably argue that that's dangerous.
So we won't let that happen.
Well, this topic leads me to think about the current ongoing discussion. And I know you're
getting ready to head to the standards meeting. So if you don't want to talk about this too much
right now, that's fine. But breaking ABI: do you have thoughts
on that? You're at Bloomberg, you're at a giant organization with lots of code. You just
talked about the fact that you open source a lot of stuff, and
you can't make decisions lightly in your design. But if you need to break ABI in your code, is that a good thing?
Are you good at recompiling? What direction do you think the standard should take on this
as this conversation is evolving? Yeah, I think that's a really good question. I think that
what we've done is we've become such a slave to it that it can be really crippling. I mean,
you can imagine the other side of the coin
where we changed it all the time, and people could also be very frustrated. My own
personal experience is that we have old platforms like Sun, and because of Sun it's kind
of like we can't do anything fun, because we have to stick with stuff from really like 1998 C++.
We have to stick with really, really old features.
And in a way, it's like we don't have any options.
We can't use PMR.
So we have to stick with our own version of allocators that were PMR 20 years before we had PMR.
And we can't get off of that.
So I'm just... I don't know. I mean,
there are two sides of the coin, there really are. What I would like to see, honestly, once we
perfect our allocator technology so that it's language-based, and if we can sell that to the
standard, I would like to see the issue of std2 revisited. You know what I'm saying? A new std that has all of the
mistakes of the last 30 years, because I'm looking ahead 30 years, removed. And then you say, okay, I'm either
on std or std2, you just say. And that is a fork, but it's a clean one-time fork, because I won't be alive for std3.
So it's a clean one time.
And the advantage of doing it after 20 years: just because of ranges was probably not enough.
But if you have ranges, modules, and no more allocators in the code, yet fully handled,
and now you have a new way of doing business, and you can get rid of
all of the warts that came from the way we overspecified certain vector things and whatever. I
mean, having certain operations have the strong guarantee just turns people
on their heads for no really good reason. People were a little premature to promise more than they
needed to. And so, you know, when you have node-based containers that have embedded sentinels, and the whole issue of
radioactive types after moves... Oh, moves are another thing. Let me just say this, just to annoy people:
Move isn't all that. If you don't use allocators, you'll get memory diffusion
and your code will slow down over time. And if you're not careful with moves, you'll get memory diffusion and your code will slow
down over time.
So when you're writing libraries, be really careful.
If you're copying stuff, move is an optimization.
And if it's reasonable to do it, you do it.
And if it's not reasonable to do it because you're going across a memory boundary, then
you'll copy, and you'll be better off for it on the other end. If you're not copying right now and you say, "But I can make it
move," stop. Think of what you're doing. You're hurting yourself. You're shooting yourself in the
foot. You're making your code incredibly hard to test for absolutely no reason. We have unique
pointer, and that is a separate piece of functionality. Now, I have talked to many,
many people on the standards committee and asked them to tell me I'm wrong. They don't tell me I'm
wrong. So, move-only types: I'm going on the record right now in saying there are about 11 of them.
First and foremost is unique pointer. There are a handful of descriptor-like things that have active destructors: file handles, sockets, shared memory handles, thread handles, and that's it. And
they're all descriptors in an operating system, and you don't need anything else, because as soon
as you're allocating memory, just use unique pointer and call it a day. Otherwise you're going
to pay a heck of a lot for something you don't want to use and you don't need to use.
And now, magic trick, right?
In C++03, you could use a declared but undefined copy constructor and, in place, return by value something that isn't even copyable or movable, as a factory function.
In C++11, you can do that same trick with move. In C++17, you don't need to use that trick anymore, as long as you return the type as a prvalue, which is just
composing it and returning it. I'm going to propose for C++23 that we do a little named RVO so that we can
create an object, beat on it, return it, and now factory functions don't need copy or move at all,
and they don't need to cheat. By the way, the cheat that I told you is safe, because it'll either work
or it won't link. I'm not doing something dangerous, right? Okay, just to make sure everybody knows. So
the cat's out of the bag. Move-only types? Uh-uh. Somebody tell me I'm wrong and give
me a counterexample. I want it. I'm asking. Seriously.
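The C++17 part of the "magic trick" can be sketched as follows: a factory function returns a prvalue of a type that has no usable copy or move constructor at all. The `Session` type and `makeSession` function are hypothetical names chosen for illustration, and the snippet assumes a compiler in C++17 mode or later.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// A type that is neither copyable nor movable.
class Session {
  public:
    explicit Session(std::string name) : d_name(std::move(name)) {}

    // Deleting the copy operations also suppresses the implicit move operations.
    Session(const Session&) = delete;
    Session& operator=(const Session&) = delete;

    const std::string& name() const { return d_name; }

  private:
    std::string d_name;
};

static_assert(!std::is_copy_constructible<Session>::value &&
              !std::is_move_constructible<Session>::value,
              "Session can be neither copied nor moved");

// Factory function: the returned prvalue is constructed directly in the
// caller's storage, so no copy or move constructor is needed (C++17 and later).
Session makeSession(const std::string& user)
{
    return Session("session-for-" + user);
}
```

A caller can then write `Session s = makeSession("rob");`. What still does not compile is the named form (create a local `Session`, modify it, then `return` it by name), which is the gap the named-RVO proposal mentioned above is aimed at.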
I'm sure some listener will take you up on that.
Maybe. Bring it.
I mean that in the nicest,
most cordial possible way because
if I'm wrong, I want to know it. I've asked
everybody I can. How should they deliver
their example to you?
Let me ask you. That's a good question.
How about, I don't know
if it's appropriate, but I will tell you this, if you're that determined, go to the standard website and
look up any of my papers and you'll find my email on one of the papers. How's that?
Okay. That sounds good. And I think many of our listeners would already know where to find that.
If you're that good that you know where to find it, then I want to hear from you.
Okay. You know, we've been talking about how you've been working on this book over the past
20 years.
We just spent some time talking about modules and move semantics.
How has the evolution of C++ affected what's gone into this book,
you know, with C++11, 14, 17?
Sure. So, again,
this book shaped C++11, 14 and 17, not the other way around. I'm not kidding. I've been in the standards committee for a long time, and I've been feeding this stuff there. This book didn't change. Modules changed to reflect components. Development has not changed at all when it comes to designing the components, packages,
and package groups that make up an enormous system. Now, move semantics is important.
Allocators are important. All of these things are very important in volume two, which will not...
Volume one was written to the C++98 subset of C++20. That's another way of saying it was written in 98.
But there are tons of footnotes that address language features that could be used if you have a higher level.
But you don't need any of them to design at this level because we're talking about macro pieces.
It wouldn't even matter what language you were in hardly, let alone what version of C++.
The reason I chose what I did is I want architecture to be approachable by anybody who knows C++. And you're not going to design differently at that level. You're just not.
You're not going to have different dependencies. So that's it. I will say, however, that the
evolution of my knowledge of generic programming and callbacks... Whereas I
talked about levelization techniques in chapter 5 of Lakos '96, levelization techniques is section 3.5
of Lakos '20, and in the callback section there are now five subsections
which talk about different kinds of callbacks: data callbacks, function callbacks, functor callbacks, virtual callbacks, and
generic callbacks.
And generic callbacks are crazy powerful things.
And please keep in mind that you can use virtual functions and get polymorphism that way.
You can use templates and get polymorphism that way.
They are similar and they are different.
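As a rough illustration of two of the callback kinds just listed, here is the same traversal written once with a virtual callback and once with a generic (template) callback. All names (`Visitor`, `Summer`, `forEachVirtual`, `forEachGeneric`) are made up for this sketch and are not taken from the book.

```cpp
#include <vector>

// Virtual callback: the traversal depends only on an abstract protocol,
// and the concrete behavior is bound at run time.
class Visitor {
  public:
    virtual ~Visitor() = default;
    virtual void visit(int value) = 0;
};

void forEachVirtual(const std::vector<int>& data, Visitor& visitor)
{
    for (int v : data) {
        visitor.visit(v);
    }
}

// Generic callback: the traversal accepts any callable taking an int,
// and the concrete behavior is bound at compile time.
template <class Callback>
void forEachGeneric(const std::vector<int>& data, Callback&& callback)
{
    for (int v : data) {
        callback(v);
    }
}

// A concrete visitor that accumulates a running total.
class Summer : public Visitor {
  public:
    void visit(int value) override { d_sum += value; }
    int sum() const { return d_sum; }

  private:
    int d_sum = 0;
};
```

Both forms let the low-level traversal call up into client code without depending on it. The virtual form fixes the interface in a base class and pays for an indirect call; the generic form takes a lambda such as `[&total](int v) { total += v; }` directly, but must be visible to the compiler at the point of use.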
And the book explains how they're similar and different. And the book says it in much the same
way that Matt Austern did back in 2000. But Matt Austern, of course, was much more avant-garde than
I ever was about templates. So now I understand. So the book will make it clear at that level as
well. So did I answer your question? No change, no change
at all for volume one. Volume two, yeah, we'll talk about the same underlying concepts because it's
software engineering. The other thing I want to say is this book is not a book on C++, it's a book
on software engineering. And that is probably the most important difference between most books.
If you get a book on C++, it'll teach you about C++.
If you get a book from me, it's about software engineering using C++, and it's a very,
very different beast. It comes from a very different place, and its longevity is much,
much greater. Lakos '96 is still correct. So then would you say this book is relevant to
and has concepts that are meaningful to people who don't use C++ and are working on large-scale projects?
Absolutely. Absolutely. 100% translatable to Java. Any programming language that has separate compilation units, this is the right book. You just won't be able to take full advantage of the nuanced control that you
get from C++. The only missing piece of nuanced control right now is having memory allocation
be part of the per-object interface. That's the last, the holy grail,
the last thing that separates C++ from the hardware is you might need to get between C++
and the hardware for placing objects in memory.
That's the last place. And this will take that away. This will remove it. And now C++ will be
lacquered to the hardware. So I'm curious, you've talked a lot about this proposal that, you said,
will remove allocators from the interface but still make them available. What does
it actually look like? What would the users... So may I make a reference? If you want to go to
Halpern, and notice it's Alisdair and Pablo, Alisdair Meredith and Pablo Halpern, CppCon
2019, "Getting Allocators out of Our Way." Okay. There's an entire hour that talks about what
we're talking about. So again, CppCon 2019, Meredith and Halpern, or Alisdair
and Pablo, I don't know which it is, "Getting Allocators out of Our Way." It's as simple as this.
When I go to take a simple type that is an allocating type, I have to add an extra constructor,
an extra argument to each constructor, to make sure that I can add an
optional allocator. That allocator is stored in the object, a pointer to the allocator, and then
used to allocate memory. Mostly, right now, people don't use it, and people might say,
"Well, that's extra overhead," and whatever. Please look at any of my 2019 talks. That's not true.
It isn't extra overhead; it's
not measurable. Please look at my 2017 talks, particularly my Meeting C++ 2017 talk on
allocators, to see just how powerful they are. We're not talking a little bit of power. We're talking,
in certain build-up and tear-down situations, just how fast the allocator works when it's not encumbered by
synchronization can give you a factor of four out of the box. We're talking about long-lived
programs. The actual allocation algorithm doesn't matter. It's the locality in memory
that's devastating. And if you don't use local allocators, you diffuse memory across all of virtual memory.
And then what happens is you get a low density of your working set per page.
And then all of your caches and your physical memory are not doing what the machine was designed to do.
And so I really want to emphasize that allocators give you and allow you to get and keep performance.
And they're super-duper important for that.
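A local (arena) allocator of the kind discussed here can be sketched with the C++17 `std::pmr` facilities. This is a minimal example under assumptions, not Bloomberg's allocator model: `makeSequence` is a hypothetical helper, and the choice of a monotonic buffer as the arena is made purely for illustration.

```cpp
#include <cstddef>
#include <memory_resource>
#include <vector>

// Builds a vector of n integers whose storage comes from the supplied
// memory resource rather than from the global heap.
std::pmr::vector<int> makeSequence(std::size_t n, std::pmr::memory_resource* arena)
{
    std::pmr::vector<int> result(arena);  // the vector stores a pointer to the arena
    result.reserve(n);                    // one allocation, taken from the arena
    for (std::size_t i = 0; i < n; ++i) {
        result.push_back(static_cast<int>(i));
    }
    return result;  // the allocator travels with the vector
}
```

A caller might back the arena with a stack buffer, for example `std::pmr::monotonic_buffer_resource arena(buffer, sizeof buffer, std::pmr::null_memory_resource());`, and pass `&arena`. Everything the vector allocates then sits contiguously in that buffer, with no locking, and passing `null_memory_resource()` as the upstream makes any spill outside the buffer fail loudly instead of silently falling back to the heap.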
But they have many other benefits.
And when you start to weigh cost benefit,
the only real cost is not to the user of allocator aware software.
The only cost is to the software infrastructure group that implements it for them.
And that cost is real.
However, if it were done by the compiler, so all you did was write
the class, and it were very much like virtual functions, where the compiler figures out, you
know, how to manage the allocator and when to use it and all of that, and all you have to tell the
object when it's constructed, outside of the signature... and I'm making up the syntax,
don't hold me to this... you just say what you would have said before,
"using" and then an allocator.
Just give it the name of the allocator
and you're done and nothing's different.
And now all the constructor overloads are gone.
All the documentation boilerplate is gone.
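For context, the boilerplate that would disappear looks roughly like this when written by hand in today's C++17 `std::pmr` style. The `Roster` class is a hypothetical example, not taken from any of the talks, and it shows only the current library-based approach; the language-level syntax under discussion is deliberately not sketched here.

```cpp
#include <cstddef>
#include <memory_resource>
#include <string>
#include <utility>
#include <vector>

// A hand-written allocator-aware (AA) type.
class Roster {
  public:
    using allocator_type = std::pmr::polymorphic_allocator<char>;

    // Every constructor needs a twin that takes an optional allocator.
    Roster() : Roster(allocator_type()) {}
    explicit Roster(const allocator_type& alloc) : d_names(alloc) {}

    // "Extended" copy and move constructors that take the target allocator.
    Roster(const Roster& other, const allocator_type& alloc)
    : d_names(other.d_names, alloc) {}
    Roster(Roster&& other, const allocator_type& alloc)
    : d_names(std::move(other.d_names), alloc) {}

    void add(const char* name) { d_names.emplace_back(name); }
    std::size_t size() const { return d_names.size(); }
    allocator_type get_allocator() const { return d_names.get_allocator(); }

  private:
    // The vector passes its allocator down to each string element.
    std::pmr::vector<std::pmr::string> d_names;
};
```

Client code pays almost nothing for this: `Roster r(&arena);` is the only visible difference from `Roster r;`. The cost sits with whoever writes and documents the extra overloads for every such class, which is the infrastructure burden described above.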
No one needs to know that they're writing
allocator-aware software, even in infrastructure.
And now it's usable and interoperable
with all modern C++:
generated constructors, generated assignment operators, aggregate initialization, lambdas,
you name it. It's all good. If that happens, there is no cost-benefit analysis. There's nothing to
talk about. We're done. I just want to emphasize, go look at that talk. Give me support. Bloomberg is
supporting me. Everybody should support me. If I get this done, no one will ever need to see an
allocator again. Gone. Okay. I'm going to put the link to this talk in the show notes for anyone who
wants to watch it. Please. I strongly encourage it. It's just in its infancy right now. It's nascent.
I've been told by my management I have to have a working demo
of this compiler by the end of this year. Okay. So you mentioned contracts offhandedly
earlier in the interview. Oh, that's the fun one.
What's that? That's the fun one.
Yeah. They were added and then removed from C++20. Are you working towards a future direction for
standardization of contracts right
now? Okay, well, as I said, one of the books, one of the three co-authored books, I don't know if I
got to all three. I gave one. No, you didn't. Yeah. Let me tell you the second one. The third one is
actually Rostislav Khlebnikov and I are co-authoring a book on contracts. We're going to be publishing with the same publisher. The target date for that is about two years out. But we're also writing a series of papers.
The thing about contracts that I think is difficult is of all the engineering that we do,
contracts fall into that gray area. I'm going to finish up on contracts, but I just want to mention Joshua Berne and I are
writing a book on allocators for the working programmer so that we can show you why they're
good without being too pedantic.
So that said, let me get back to contracts.
So contracts, all of these three co-authored books are in a two-year time frame, just letting
you know.
Okay.
And we need them because we need people to realize how important they are, and when contracts are
available, and I'm pretty sure they will be in '23, we want people to take full advantage. So the
thing about contracts is, if you're doing contracts right, if everything is right, you're
having absolutely no effect on behavior, which is a really bizarre thing to say. If
everything is good... because a contract check is a defensive check. What's a defensive check?
What are you defending against? You're stating that this thing that I'm checking cannot happen.
Cannot. No way. I put my sign: it can't happen. So the simplest version of this is I write a very elaborate averaging.
I want to get the mid-range of two integers.
Now, to do that properly, I have to do some funny stuff.
I can't just say A plus B over 2 because I could overflow.
So I have to do some funky stuff like A over 2 plus B over 2 plus... what is it? You know, you have to
do that long kind of funny expanded thing. You could get that wrong. Now, the right way to do this
is using test drivers but another way to do it is to write an assertion right after that that says
as long as the numbers are small it better be the same as A plus B over two, right? So you
could say if abs A is less than a thousand and abs B is less than a thousand, then it is A plus B
over two. There's no overflow. You could do that. And that would be a redundant check. You don't
need to write that if you've tested your program. So anytime you write a check by inspection locally,
it's shrink-wrapped. You can't get it wrong, it's redundant, you can turn it off and the logic of the program is the same.
The only thing that changes is performance.
The next step on that is to say I'm going to have a precondition check.
A precondition check is a defensive check.
What are you defending against?
You're defending against misuse by trusted clients in your shrink-wrapped program.
So when the program is all said and done and compiled and linked, nobody can change the
calls.
The library is hooked up to the app.
It's one program.
And then those checks that you have to make sure that the thing wasn't called out of contract
are redundant.
Now, once you run this program for long enough and none of those checks fire,
you say, I'm good. You turn off the checks and the program runs faster and life is good because the program is mature. That is a defensive check. Now let's look at the other side. What is input
validation? Input validation. I have a config file or I have a customer data coming over the wire. Do I trust
the customer? Hell no. Customer is a trading system. So I'm not going to defensively check
against that the data is valid. I'm going to validate it in every build mode forever because
no matter how good my program is, any new customer comes along could kill me.
That's input validation.
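The distinction between a defensive precondition check and input validation can be sketched side by side. Both functions are hypothetical examples: `elementAt` stands in for a library function with a narrow contract, and `parsePort` for code that reads untrusted text. Standard `assert` is used here as a stand-in for a contract-checking facility, and `std::optional` assumes C++17.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Defensive check: callers are trusted code linked into the same program.
// The assertions state what a correct program never violates, and they
// compile away when NDEBUG is defined.
int elementAt(const int* data, int size, int index)
{
    assert(data != nullptr);
    assert(0 <= index && index < size);  // precondition, not a run-time feature
    return data[index];
}

// Input validation: the text comes from outside the program (a config file,
// the wire). It is checked in every build mode, and bad input is an expected
// outcome reported through the return value.
std::optional<int> parsePort(const std::string& text)
{
    if (text.empty() || text.size() > 5) {
        return std::nullopt;  // at most 5 digits, so value cannot overflow
    }
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') {
            return std::nullopt;
        }
        value = value * 10 + (c - '0');
    }
    if (value < 1 || value > 65535) {
        return std::nullopt;
    }
    return value;
}
```

Disabling the assertions in `elementAt` leaves a correct program's behavior unchanged. The checks in `parsePort` can never be disabled, because their outcome is part of the function's defined behavior.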
Now you might say, well, wait a minute.
What if it's my own config file?
I wrote it myself.
It's good.
Life is good.
Now the question becomes trickier.
Who's putting the config file there?
Is it shrink wrapped into a container or is it, you know, Joe put it in today
and somebody released this thing tomorrow
and they're inconsistent.
So the wisdom is, unless it's shrink-wrapped together by dint of being a compiled program,
or you have some kind of module, container, something, package, where you deploy the whole
thing at once after it's been tested and it cannot come unglued, then the check is an
input validation check, not a defensive check. But if you can absolutely guarantee that it's
been shrink-wrapped and tested, then the internal defensive checks can be removed and there's no
change in behavior and no risk of its getting out of sync. So that's where contracts are hard.
And certain people on the standards committee do not understand that a program with no contracts,
a program with some contracts, and a program with all contracts enabled are the same program,
assuming that the program is correct.
And if the program isn't correct, then having a contract that checks and continues is no
worse and infinitely better than not having a contract at all.
There's a misunderstanding of what's going on there and that misunderstanding, because it's not
easy to appreciate if you're not a real day-to-day software engineer, that is what derailed contracts.
I will fix it. I promise you, I will fix it. In fact, I think everybody is getting to understand that
contracts are a different beast and that if you write contracts properly, they have no effect on
a correct program and they don't defend against anything with the correct program. They don't
defend against bad input with a correct program. They do nothing with a correct program. If your
program is incorrect, it means you didn't test it. You should go back and test it. But once it's all working, leave the contracts effective for
a little while, make yourself happy, then turn them off and buy less hardware. Okay.
Well, I'm looking forward to contracts. It was one of the things I was looking forward to the
most for C++20 because even forget the design or any question about the design, any tool that
helps us validate software, I'm good with. And let me
just point out that there are three reasons for having contracts, I believe.
There are sort of three different camps. One of them was making software
safer; I think that was one of the things. And one of them is making software faster,
because once you have assertions in place and you know they're true, not checking them is faster
than checking them, but assuming that they're true and letting the compiler optimize based on them
is even more. And so runtime checks, awesome. Compile time checks for static analysis, awesome.
Speeding up runtime code based on optimization for the compiler, interesting.
All three of them together for free because it's a Trojan horse.
If you have contracts in place, anybody can do static analysis.
Right.
All right.
Well, John, I think we're running out of time, so we'll let you go.
But thank you so much for coming on today, and I encourage listeners to check out the book.
Well, I thank you very much for having me, and I think I managed to touch on everything I was hoping to touch on.
And thanks for asking about contracts.
We're going to do our best to get them in as soon as possible to 23.
And let's not wait till 26. I want everybody who's interested in getting them in by 23 to write
your congressman or write your language designer and say, yes, we want contracts in 23.
Okay.
Okay. Sounds good. Thanks, John.
All right. Take care.
Thanks.
Bye.
Thanks so much for listening in as we chat about C++. We'd love to hear what you think of the
podcast. Please let us know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic, we'd love to hear about that too.
You can email all your thoughts to feedback at cppcast.com.
We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter.
You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter.
We'd also like to thank all our patrons who help support the show through Patreon.
If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast.
And of course, you can find all that info and the show notes on the podcast website at cppcast.com.
Theme music for this episode was provided by podcastthemes.com.