CppCast - Large Scale C++

Starting point is 00:00:00 Thank you. Check out their new Visual Studio extension for C++ and claim your free trial at backtrace.io. In this episode, we talk about freestyle rapping and C++ reference cards. Then we talk to John Lakos. John talks to us about his new book. Welcome to CppCast, the first podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how's it going today? I'm all right, Rob. How are you doing? Doing fine. You had a little announcement to make, right? Yeah, so I've got two classes that are coming up in Stuttgart at the end of September and

Starting point is 00:01:42 right into the first day of October there, which are on writing correct, performant code and taking advantage of compile-time constexpr and templates topics. So I just suggest that our listeners go and check those out. We'll link to those in the notes. Anyone can sign up for them. Yeah, we'll definitely put those in the show notes. And are you going to Stuttgart just for these courses, or did you already have plans to be there for a conference or something? No, that's just for these courses. Oh, wow. That's great. Okay, well, at the top of our episode, I'd like to read a piece of feedback.

Starting point is 00:02:15 We got this comment on the website, and unfortunately we got a couple of similar comments on Reddit. So this one was from Matt Dawson saying, very excited about this topic, but audio quality for the guest was very poor and I eventually gave up. I guess this is the exception which proves the rule, as the audio quality on CppCast is usually great. So yeah, I really want to apologize for the poor audio quality, both to our listeners and to the guest, because, you know, listeners weren't able to really hear,

Starting point is 00:02:45 uh, what Vadim had to say. Um, we were just talking before the show, how maybe we'll invite him on to kind of do a redo. We've never done a redo before, but, uh,

Starting point is 00:02:55 we have not, but it would be nice to have Vadim on, uh, you know, make sure he has some, some better audio equipment beforehand. And, uh, yeah,

Starting point is 00:03:02 we'll make sure that we don't let that sort of thing happen again. Um, we kind of both thought at the beginning of the interview that, yeah, this isn't great, but we've scrubbed, you know, somewhat poor audio before, but it just was a lot worse than we thought it was. Yeah. And it kind of seemed like the quality was coming and going a little bit, more like when he was doing the majority of the talking, it seemed like the audio was, from my perspective, better. So I thought that it would be something that we could clean up and the listener would never really notice, but, um, yeah. Yeah, that's not

Starting point is 00:03:33 how it worked out. Yeah, unfortunately not. So yeah, we will definitely keep this in mind and try to be better. Apologies, you know, again to all listeners who weren't able to stick it through the whole episode because of the quality, though. Well, we'd love to hear your thoughts about the show. You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com. And don't forget to leave us a review on iTunes or subscribe on YouTube. Joining us today is John Lakos. John, author of Large-Scale C++ Software Design, serves at Bloomberg LP in New York City as a senior architect and mentor for C++ software development worldwide. He's also an active voting member of the C++ Standards Committee's Evolution Working Group. Previously, Dr. Lakos directed the design and development of infrastructure libraries

Starting point is 00:04:14 for proprietary analytic financial applications at Bear Stearns. For 12 years prior, Dr. Lakos developed large frameworks and advanced IC CAD applications at Mentor Graphics, where he holds multiple software patents. His academic credentials include a PhD in computer science and an ScD in electrical engineering from Columbia University. Dr. Lakos received his undergraduate degree from MIT in mathematics and computer science. His new book, the first volume of which is entitled Large-Scale C++ Volume 1: Process and Architecture, is now available from Pearson Education. John, welcome to the show. Well, thank you very much for having me.

Starting point is 00:04:49 It's a pleasure to be here. I have not, I don't believe, ever heard anything about your history with Mentor Graphics in the past. I'm kind of curious what you worked on there, if you can mention more about it. Be happy to. It's been, geez, it's been 23 years since I left Mentor Graphics. And I published my first book while I was at Mentor Graphics based on the knowledge that I gained from working there. I was there 11 years. And we worked on very large scale at the time, software systems.

Starting point is 00:05:19 And what we worked on is CAD tools. We had routers. We had visual editors. And my particular contribution was a, I'm going to say it's a device and wire extraction and recognition system so that you could take layout that was just handmade layout, extract all the transistors and the contacts and whatever. So you get all the devices out. Then the second patent is to extract the wires using all-angle geometry, and I took advantage of vector calculus to make that super quick. And in fact, being able to extract a 200-segment wire is something that seems like a really hard problem to people. And what's fantastic is that the early part of this book that I just wrote captures the

Starting point is 00:06:07 power of what's called dynamic programming or memoization, so that even though it seems like an exponentially hard problem, it's really a very simple quadratic problem if you get it right. And so, again, using that wonderful technique, I was able to do very, very large problems in what seemed like an impossibly short time. And it was quite successful. We could extract entire semi-standard libraries. We could extract them, retarget them to a different technology, and then put them out. What would take a year could be done in a few days.
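As a rough illustration of the memoization idea John mentions, here is a minimal C++ sketch. It is not the wire-extraction algorithm from the interview; the grid-path problem and the names `countPathsNaive`, `PathCounter`, and `countPathsMemo` are assumptions chosen only to show how caching subproblem results turns exponential recursion into work proportional to the number of distinct subproblems.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy problem (an assumption for illustration, not the wire extractor):
// count the monotone paths (each step goes right or down) across a grid
// of rows x cols cells.

// Naive recursion: the same (rows, cols) subproblem is recomputed over and
// over, so the running time grows exponentially with the grid size.
std::uint64_t countPathsNaive(std::size_t rows, std::size_t cols)
{
    if (rows == 0 || cols == 0) {
        return 1;  // only one straight path remains
    }
    return countPathsNaive(rows - 1, cols) + countPathsNaive(rows, cols - 1);
}

// Memoized version: each subproblem is solved once and cached, so the work
// is proportional to rows * cols (quadratic for a square grid).
class PathCounter {
    std::size_t                d_rows;
    std::size_t                d_cols;
    std::vector<std::uint64_t> d_cache;  // 0 means "not computed yet"

  public:
    PathCounter(std::size_t rows, std::size_t cols)
    : d_rows(rows), d_cols(cols), d_cache((rows + 1) * (cols + 1), 0) {}

    std::uint64_t count(std::size_t r, std::size_t c)
    {
        assert(r <= d_rows && c <= d_cols);  // stay inside the cached grid
        if (r == 0 || c == 0) {
            return 1;
        }
        std::uint64_t& slot = d_cache[r * (d_cols + 1) + c];
        if (slot == 0) {
            slot = count(r - 1, c) + count(r, c - 1);
        }
        return slot;
    }
};

std::uint64_t countPathsMemo(std::size_t rows, std::size_t cols)
{
    PathCounter counter(rows, cols);
    return counter.count(rows, cols);
}
```

Both functions return the same counts; only the memoized one stays practical as the grid grows, which is the effect described above.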

Starting point is 00:06:43 It was pretty spectacular. So interestingly, it sounds like, if I might read a little bit between the lines, that the work that you did on graphics early on, even though it seems completely unrelated to the financial industry today, is still helping you. What you learned there is helping you today still. Well, when we say Mentor Graphics, graphics is, I mean, it's really CAD, right? It's really computer-aided design. And yes, there's some graphical rendering and whatever, but it's more hard geometry and math and compaction and algorithms. It's not quite graphics. I just want to be clear, it's not the front-end kind of graphics, it's the back-end kind of graphics. And yes, the back-end kind of graphics is rife with very interesting, very hard problems

Starting point is 00:07:32 that require heuristic solutions. Well, John, we've got a couple news articles to discuss. Feel free to comment on any of these, and then we'll start talking more about your book, and maybe a little bit about the Prague ISO meeting next week. Sure thing. Okay, so this first one is, um, freestyle C++ rap. Victor Zverovich, who we have had on the show before to talk about fmt lib, I guess reached out to this, uh, British comedian YouTuber and was able to get him to do a freestyle rap about C++, which is pretty impressive because this guy, as he says in the video, knows absolutely nothing about programming, definitely knows nothing about C++, but he did a pretty good job making a

Starting point is 00:08:18 relevant on-the-fly rap about it. Yeah, it's pretty funny. Yeah, I don't have much to say. Just go watch the video. It's funny. I did have a chance to take a look at it and I was impressed. Anything that's impromptu like that, that he's not a subject domain expert in and getting that much nuance out of just a little bit of Googling,

Starting point is 00:08:38 my hat's off to him. He's talented. And just for the record, Chris Turner, as far as I know, no relation to me. I wonder if they're going to play this at the start of the Prague meeting. Maybe. Oh, that'll be entertaining. Okay, and then the next thing we have is, uh, from Bartek's blog, and this is a C++20 reference card. And this is pretty neat. Um, if you sign up for his mailing list, you can download it, and it has a pretty good concise reference for all the new language and library

Starting point is 00:09:12 features being added in C++20 that can fit on a page. Um, pretty cool that he was able to put this together, and you can get it for free. Yeah, these kind of things are pretty neat. And I also had an opportunity to look at that, and I thought it was a very handy way of at least knowing what your options are. And then if you need to know something more, you can always explore deeper. But just having the index is fantastic, is great. Yeah, and he's put one out for C++17 as well. So if you need a refresher on, you know, what's been added over the past four years, you can look at both. He's constantly producing little downloadable things like this, and yeah, you have to sign up for his mailing list, but not that big of a deal, really. Yeah, free mailing list. Okay, so John, as we mentioned

Starting point is 00:09:55 in your bio, um, you just released Large-Scale C++ Volume 1: Process and Architecture. Um, and you mentioned that you had your original book, Large-Scale C++ Software Design. Is the new book kind of an update to the first one, or is it all new material? So that's a good question, and the answer is a little bit of both, but not really. What I mean is, it has to be, because there's nothing in the old book that I would take back, particularly. I mean, I did. I will. All right.

Starting point is 00:10:29 I take it back. There's one place back in the day where I had a diagram and I had a string and a word and I used derivation to show as a sort of an off-the-cuff dependency example, and I wrongly, and I'll never live it down, wrongly derived word from string, saying that a word is a string that has less, more restrictions on itself. And of course, because of slicing and all the other things that we all know are just terrible, I got just, I was never forgiven. There was a gentleman, I'll just say his first name, Dat, who made it very clear to me many times forever that I just didn't know what I was talking about. And I quickly conceded, yep, yep, yep. And oh my goodness, it doesn't matter. There's no excuse for putting something in that's wrong to show something that's right.
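To make the problem John describes concrete, here is a minimal sketch of why publicly deriving a more restricted type from a less restricted one goes wrong. The classes below are hypothetical stand-ins, not the diagram from the 1996 book: `Word` is assumed to mean "a `String` with no spaces", and `appendGreeting` and `copyAsString` are made-up helpers.

```cpp
#include <cassert>
#include <string>

class String {
    std::string d_text;

  public:
    explicit String(const std::string& text) : d_text(text) {}

    void append(const std::string& more) { d_text += more; }
    const std::string& text() const { return d_text; }
};

// Hypothetical "Word": a String that promises to contain no spaces.
class Word : public String {
  public:
    static bool isValid(const std::string& text)
    {
        return text.find(' ') == std::string::npos;
    }

    explicit Word(const std::string& text) : String(text)
    {
        assert(isValid(text));  // the extra restriction Word adds
    }
};

// Any code that sees a Word through its base class can break the
// restriction, because String::append knows nothing about it.
void appendGreeting(String& s) { s.append(" hello"); }

// Passing a Word by value as a String "slices" it: only the String part
// is copied, and the result is no longer a Word at all.
String copyAsString(String s) { return s; }
```

For example, after `Word w("cat"); appendGreeting(w);` the object still has type `Word`, yet `Word::isValid(w.text())` is false, and `copyAsString(w)` compiles without complaint while discarding the derived type.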

Starting point is 00:11:23 So I take full responsibility for that. And now that's off my chest and everything else I stand by. So this new book, oh, and there's one other story I can tell you. When I wrote that book, I have to give credit to John Waite at Addison Wesley at the time because the reason I even wrote that other book is I was working at Columbia and someone called me or I got a flyer in the mail saying, do I want to get a desk copy of this other book that was a good book? I don't remember which one it was, but it was one good book of the day. And I said, absolutely. And so I let them know.

Starting point is 00:12:00 And they said, well, how do we know that you work at Columbia? And I was like, wait, wait, you sent me the desk references. And then I got a little, you know, as my friend Sean would say, howly. And I said, do you realize I'm even thinking of writing a book? And it would be on large scale C++ and blah, blah, blah. The next thing I know, John Waite flies out to see me, you know, and next thing I know, I'm writing a book. So that's, that's actually in the 1996 book that started in 93. And what I wrote about in that book, they wanted me to

Starting point is 00:12:30 write a book called Large-Scale C++ Software Development. And I didn't know about development, I was a kid of 34. You know, how would I know anything about development? But I did know about design, or at least I thought I did, except for that one thing that I got wrong. Uh, never happened again, not in a million years. But, um, so I started writing this, and it did take me about 14 months to come up with the first draft. But then there were some appendices that I felt were absolutely essential to get right, and that took another, you know, maybe another year. And then finally doing all the figures, there were like 400 figures in that book, and so that wound up taking another year. So by the time I finally published, it was July of

Starting point is 00:13:09 2000, no, 1996. And, uh, so anyway, that's Lakos96, and, uh, you know, I'm good with that. But that book was not a complete treatment of C++ software design. It was really an architecture book. This book that you have here is part of a three-volume series. It's the first part. I wrote the third volume first, which is Verification and Testing, because 20 years ago, 21 years ago, when I started writing this book, I knew a whole lot about testing that I didn't get to put in the first book. So I wrote the third volume first, chapters 7 through 10. I literally wrote them. I have them.

Starting point is 00:13:51 I published chapter 7 at C++ Connections in 2004. And I've been thinking about publishing this like every year since. But it just never, anyway, that happened. Finally, in about 2011, someone joined Bloomberg. His name is Jeffrey Olkin, and he is the, oh, I'm jumping ahead. Prior to this, I needed somebody to help me do structural editing on the book, because the book was looking like it was on the order of 2,500 pages, and I needed just somebody else to bounce ideas off of. And Pearson was very, um, generous and hired a structural editor. And the structural editor read the manuscript and says, you know, there's a lot of stuff here. Unfortunately, even though he was a PhD in computer science and this was his job, he just said, I just, I don't know, I can't help you.

Starting point is 00:14:45 So anyway, now fast forward to 2011, when this fellow Jeffrey Olkin, uh, joined Bloomberg. And he's very important to me because, um, back then I was very much into helping people write contracts. We might talk about that later. And I saw a paper written by somebody who was a manager of another group. And I was reading it and it was okay, but it wasn't quite stellar. Then all of a sudden around, I don't know, maybe it was page 13 or something, it became perfect English, just perfect. And I went over and I said, what happened here? Oh yeah, I started this and I gave it to Jeffrey Olkin. I said, where's Jeffrey Olkin? So I found Jeffrey Olkin.

Starting point is 00:15:27 Shortly after that, he became the structural editor of my book. And he read every word of this book and helped me rewrite it. So I owe him a ton as being my ghost sort of co-author. He didn't actually come up with the words and ideas per se, but he helped me structure it in a way that was coherent. So, I mean, that was really huge. Um, so many thanks to Jeffrey Olkin. So what's this book? This book right here, this volume that I'm touching, is the physical design part of the three-volume set. So the first volume is Process and Architecture, and it talks about the same kinds of things that the original book talked about.

Starting point is 00:16:08 But the original book had three parts to it as well. It had sort of a preparation, then it had a notion of what physical design is, and then it had a little bit more on logical design. And the first book is known for the physical design contribution. And this book is all, all, I tell you, all physical design. And then what's really nice about this is the final chapter, which is rather intimidating. It's about half the book

Starting point is 00:16:33 is how to apply physical design in large scale software development. So that's what this book is in a nutshell-ish. Okay, so you've mentioned the first book and you told us that you already have the third book written, but you, as far as I know, didn't say anything at all about what the second volume is going to be. Well, the second volume is called Design and Implementation. So it's got a fairly wide, um, category. Chapter four really talks about some stuff that you just have to know, and no one really talks about it. So the value of a value is the first section of chapter four. Chapter four is really about interfaces and such. And the next section is classifying classes because I felt like

Starting point is 00:17:18 being cute. The third one is value semantics and the fourth one is vocabulary types. And those titles have been there since the 1990s. They're just known and I use them routinely. And I thought I understood value semantics pretty well back in 2001. Then right around 2008, an interesting story, I was in the process of getting something that's also near and dear to my heart, allocators properly into the standard. I know that's a touchy subject, and I do want to touch on it later. But it was pushed back, and I had to wait for the next meeting. I was in one of those big rooms with no windows in Hawaii, and they were talking about 25 men in t-shirts with laptops. And they were talking about a defect.

Starting point is 00:18:08 And they were talking about a defect because in the wording of the standard where they said that these two pattern objects were the same. And people were going, well, what do you mean the same? Well, you know, the same, that they, well, maybe they compare equal. Oh, um, what would happen if a copy construct, you know, there was no wording. There was no way to say what they meant by the same. Literally, someone finally said things like, well, they're equal, they're equivalent, you know what I mean. But no one knew what they meant. Did it mean the same object? Did it mean that there was something about the two objects that was the same?

Starting point is 00:18:47 What was it? And it turned out that what the two objects had that was the same is the same value. Well, what do we mean by value? And now we need to understand what salient attributes are. And so while I was there, I vowed, because I was just really upset that my proposal didn't go through. And I didn't say anything in the meeting because I said, these guys, come on, please. I said I'm going to write a presentation, which I then presented at ACCU shortly thereafter. But it got me started.
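A small sketch may help pin down what "the same value" and "salient attributes" could mean in code. This is an illustration under assumptions, not text from the book or the standard: `IntSequence` is a made-up class whose value is taken to be its sequence of elements, while its capacity is treated as a non-salient attribute that equality ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical value-semantic type. Its salient attributes are the number
// of elements and the elements themselves; capacity is deliberately not
// part of the value.
class IntSequence {
    std::vector<int> d_elements;

  public:
    void append(int value) { d_elements.push_back(value); }
    void reserve(std::size_t n) { d_elements.reserve(n); }

    std::size_t size() const { return d_elements.size(); }
    std::size_t capacity() const { return d_elements.capacity(); }

    int operator[](std::size_t index) const
    {
        assert(index < size());
        return d_elements[index];
    }

    // Two objects have "the same value" when their salient attributes
    // match; comparing the underlying vectors ignores capacity.
    friend bool operator==(const IntSequence& lhs, const IntSequence& rhs)
    {
        return lhs.d_elements == rhs.d_elements;
    }
    friend bool operator!=(const IntSequence& lhs, const IntSequence& rhs)
    {
        return !(lhs == rhs);
    }
};
```

Under this definition a copy compares equal to its original, stays equal after `reserve` changes only its capacity, and stops being equal once an element is appended to one of them, while the other object is left untouched.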

Starting point is 00:19:18 And it was right around the time that Alex Stepanov wrote his book, Elements of Programming. This is a true story. I got the book, Elements of Programming, and I started to read it, and I couldn't understand it. I put it in my briefcase, and I left it there. For years, I carried it around, and I had no idea what it said, nothing. I just couldn't. I carried it with me. Then finally, I was asked to interview Alex Stepanov on his new book, From Mathematics to Generic Programming.

Starting point is 00:19:47 And I said, damn it. All right. I got to go back and read the book. So I couldn't read the book. So I photocopied the pages and made the print really big. And I got past page two and discovered that chapter one of his book is all about value semantics. And I'm like, oh my God, if he'd just said so. Anyway, what's really wonderful is that the first chapter of Alex's book and the entire thesis that I have on value

Starting point is 00:20:12 semantics is within you know a micron of the same thing and where they differ is so sort of nebulous and unimportant i mean it kind of it's because he comes from a more mathematical background and I come from a more software engineering background, but we backed into it the same thing, pretty much the same thing, exactly the same thing. And it's very heartening to know that if you come at something from two different ways and you get the same answer, you're probably right. And that's the beauty of testing. If you can test something in a way that's completely different than the way you designed it, then you have a lot of confidence that it's right. So there's a lot of goodness in that.

Starting point is 00:20:52 So anyway, where was I? I think we're talking about what was in the second book. Okay. So in the second book, the notion of value semantics is covered. And towards the end of it, I mean, there's a lot about interfaces. And towards the end of it, which is the end of Chapter 4, there is a dedicated section on memory allocation. And that's because memory allocation needs to be part of the interface of objects. And this is a contentious issue. And a lot of people can fairly say with a library-based solution for memory allocation, it's ugly, it's expensive to write,

Starting point is 00:21:32 it's expensive to maintain. There's a lot of clutter in the interface. And if you're not one of the power users of memory allocation, you're not a games person, you're not at Wall Street or in a hedge fund or you're trying to do low latency work or you're not doing something in an embedded system, if you're not one of those guys, but you just want to use C++ because it's a cool language, it's a pain. It was not usable in C++

Starting point is 00:21:56 03 because there were weasel words that said basically it doesn't have to work. In C++11, it still invaded the type system, which is why vocabulary types and templatized policies don't work well together. And so that was that. In C++17, we got the polymorphic memory resource, which my good buddy Pablo, who used to work for me and then went to Intel and now works for me, is in charge of. But it's not good enough, because as much good as memory allocation brings to performance, and as much ancillary good that it does for being able to instrument at scale anything in production, and being able to place whole objects in special memory, and rapid prototyping and reduced risk and even garbage collection in an arena, all of those

Starting point is 00:22:47 things that you can do, that's not quite good enough. Why? Because the language solution doesn't allow interoperability with modern C++ aggregate initialization or compiler-generated constructors and assignment. And so in order to get what you want, you, A, have to write it all yourself and get it right. And B, you can't use it in fun ways like lambdas and so on. And so it's kind of eh. But if you're not interested in the cool parts of C++ and all you want to do is get your job done, it's indispensable. Now imagine that the language itself made your classes allocator aware, and they didn't show up in the constructors at all. So all you have to do is just say, I want to create this vector using this allocator. Done. And you never even knew that you wrote the class that was allocator aware

Starting point is 00:23:42 because there's no allocators in it and it's kind of the same as making a function virtual. If you make it virtual, do you see all those pointers to functions that the compiler put in? No, you just know that it's polymorphic. God bless. So if we at Bloomberg, by the way, on this particular podcast, I'm speaking for me, but when I'm at the standards committee, I speak for Bloomberg. I'm the voting member. But I want to tell you that, for real, if I'm successful and this actually comes to pass, all of the allocators that we've been using at Bloomberg for 20 years, the library-based allocators that we could not live without would come out. We'd take them out and use the language. And that's a strong statement.
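To show the kind of interface clutter John is talking about, here is a hedged sketch of a hand-written allocator-aware class using the C++17 `std::pmr` facilities. It is not Bloomberg's library code; `Portfolio`, its members, and `usesArena` are hypothetical names, and the class is only one plausible way to thread an allocator through every constructor by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <string>
#include <vector>

// Hypothetical allocator-aware class (library-based approach, C++17).
// Note how the allocator shows up in every constructor.
class Portfolio {
    std::pmr::string      d_owner;
    std::pmr::vector<int> d_positions;

  public:
    using allocator_type = std::pmr::polymorphic_allocator<std::byte>;

    explicit Portfolio(const allocator_type& alloc = {})
    : d_owner(alloc), d_positions(alloc) {}

    explicit Portfolio(const char* owner, const allocator_type& alloc = {})
    : d_owner(owner, alloc), d_positions(alloc) {}

    // Allocator-extended copy constructor: the copy uses the allocator
    // supplied here (or the default), not the one used by 'other'.
    Portfolio(const Portfolio& other, const allocator_type& alloc = {})
    : d_owner(other.d_owner, alloc), d_positions(other.d_positions, alloc) {}

    void addPosition(int quantity) { d_positions.push_back(quantity); }

    const std::pmr::string& owner() const { return d_owner; }
    std::size_t numPositions() const { return d_positions.size(); }
    allocator_type get_allocator() const { return d_positions.get_allocator(); }
};

// Builds a Portfolio whose memory all comes from a local buffer. The null
// upstream resource would throw std::bad_alloc if the buffer ran out.
bool usesArena()
{
    std::byte buffer[4096];
    std::pmr::monotonic_buffer_resource arena(
        buffer, sizeof buffer, std::pmr::null_memory_resource());

    Portfolio p("an owner name long enough to need a heap allocation", &arena);
    p.addPosition(42);
    return p.get_allocator().resource() == &arena && p.numPositions() == 1;
}
```

The point of the sketch is the boilerplate: every constructor repeats the allocator parameter and forwards it to each member, which is the part a language-level feature, as described above, would make disappear.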

Starting point is 00:24:25 Without allocators in the language, we will continue to maintain them ourselves out of necessity. They're that important. Wow. Okay. I want to interrupt the discussion for just a moment to bring you a word from our sponsors. Backtrace is the only cross-platform crash and exception reporting solution that automates all the manual work needed to capture, symbolicate, dedupe, classify, prioritize, and investigate crashes in one interface. Backtrace customers reduce engineering team time spent on figuring out what crashed, why, and whether it even matters by half or more.

Starting point is 00:24:55 At the time of error, Backtrace jumps into action, capturing detailed dumps of app environmental state. It then analyzes process memory and executable code to classify errors and highlight important signals such as heap corruption, malware, and much more. Whether you work on Linux, Windows, mobile, or gaming platforms, Backtrace can take pain out of crash handling. Check out their new Visual Studio extension for C++ developers. Companies like Fastly, Amazon, and Comcast use Backtrace to improve software stability. It's free to try, minutes to set up, with no commitment necessary. Check them out at backtrace.io slash cppcast. So you mentioned the three different volumes. The first volume is already out, and your original book came out, you know, 20-some years ago. I'm assuming we're not going to wait

Starting point is 00:25:40 20 years for the second and third volume. It sounds like they're mostly written already. So now I get to talk about future books real quick. So there's the volume two and volume three. People joke that I'll be 100 years old by the time volume three comes out. Well, they don't know Laurie. Laurie Hughes is someone that I had the pleasure of working with in producing volume one. I knew she was talented because she was able to render three-dimensional things. She's a technical writer. She's also project manager.

Starting point is 00:26:12 And she got me to get this book out this year. I think there's no other human being that could have done it. And she was merciless. And I'm not easy to work with, especially when I get no sleep for five months. So it was challenging. But she and I together managed to get me to get it done. She has already put into Pearson for a schedule for the second and third books.

Starting point is 00:26:35 I mean, it's not going to be done in a few months because the stuff is old and needs to be updated. But the sections are already forward referenced from the first volume. That's how, yes, seriously, that's crazy. So that doesn't normally happen. We're forward referencing books that don't in theory exist, but they do. So two more volumes are coming. If I had to give it an honest guess before Laurie, I would have said in the next 10 years. With Laurie, I'd say in the next five. Now, if I beat that, that's great, but let's be real. In the meantime, I have three co-authors on three separate books. Uh,

Starting point is 00:27:12 Vittorio Romeo, who works at Bloomberg, and I are putting out something based on a paper that we wrote that was seen and blessed by many senior people in the Standards Committee. And we call the paper Embracing Modern C++ Safely, and it's called the, you know, EMC squared paper. But basically what it is, is it took C++11, only after many, many years, like seven years, and said, here's C++ in 10 pages, in 10, you know, or 11 pages, uh, C++11 language features. Here's what they are, and here's what to be careful about. And they're sorted into three different categories. And this is just to annoy people.

Starting point is 00:27:55 Safe features, conditionally safe features, and unsafe features. And there are only two unsafe features, and it's tongue-in-cheek. There's nothing unsafe about C++11. What's unsafe is if you have 6,000 people at a company like Bloomberg and you start doing, um, member references and you don't know why, okay, or you start making things final and you don't have a company-wide policy that says we have control over all software, we can always unfinalize it whenever we need to. If you just ask us, take us two seconds.

Starting point is 00:28:28 Well, a place like Google can do that. But because Bloomberg actually open-sources and makes things public, we don't have control over everybody that uses us. And so we have to make more open, like the standard, open decisions. And so for us to put final on something inhibits one of the critical theses of the book and of the new book, which is hierarchical reuse. Hierarchical reuse is all about reuse, where you're not just saying reuse this logger or reuse this matching engine or something. Whenever you build something like that,

Starting point is 00:29:05 you build it out of smaller pieces. And the smaller pieces themselves are useful for building similar things. Those smaller pieces in turn are built out of smaller pieces. And even a vector, when you look under the covers, anybody who's building the standard library

Starting point is 00:29:18 has a whole library underneath of hierarchically reusable pieces that they use for all their containers, all the way down to metaprogramming. And so what I'm advocating is make fine-grain modular pieces that are stable and resort to hiding only when the interface between what you're exposing and what you're hiding is unstable. So having a private class or a nested class, something like that in a component,

Starting point is 00:29:44 which is just a .h/.cpp pair, is something that you do because of instability and not because of sloppiness, not because you don't want to test it, none of that. There's also, you don't put two classes in the same component unless you have a reason. Again, a component is like a module. It's like a .h/.cpp pair. And in fact, modules are better components. But if you don't have modules, you have components. And the way you design a module and the way you design a component are the same. And this volume heavily footnotes, when we have modules widely in use, how you would

Starting point is 00:30:20 do exactly what this book says using modules. But the design is no different. There's no change. So if you're interested in modules, buy this book. And just to be perfectly clear, we are talking about the C++20 feature, modules. Yes, yes. And the C++20 feature, modules, was based on what I call a component

Starting point is 00:30:38 that's been around for 25 years. It's the same thing in many ways. The differences are important. For example, in Lakos96 I said no design cycles, no cyclic physical dependencies. In modules, as a result of my 2017 standards paper, there are no cyclic dependencies possible, except if you do some, I don't even want to tell you how to do it. I think it's either not possible or it's so arcane that you have to be at the fifth degree black belt level to do it. That's good enough. If you really want to shoot yourself in the

Starting point is 00:31:15 foot, you might be. It's not the foot, this would be in the head. The other one is long distance friendship. The two most important design imperatives that I can give anybody are no physical cycles, no long distance friendship. Modules do not permit long distance friendship. What that means is you cannot have something in one module say that something in another module is a friend. And that means that you don't have this incredible coupling that would make Parnas turn over, you know, because Parnas really liked modular programming. Dijkstra really liked no dependencies. And so those two people can rest in peace. I think you just challenged probably at least a dozen of our listeners, though, to figure out how to create cyclic dependencies in modules.
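As a rough sketch of what friendship that stays local might look like, the example below keeps a container and its iterator together and lets client code use only the public interface. Everything here is an assumption for illustration: the component name `intlist`, the classes `IntList` and `IntListIterator`, and the function `sum` are hypothetical, and a single file stands in for what would normally be separate .h/.cpp pairs.

```cpp
#include <cassert>
#include <cstddef>

// --- hypothetical component "intlist" (would live in intlist.h/.cpp) ---

class IntListIterator;

class IntList {
    int         d_data[8];
    std::size_t d_size = 0;

    // Local friendship: the friend is defined in this same component, so
    // the private coupling never crosses a component boundary.
    friend class IntListIterator;

  public:
    void push(int value)
    {
        assert(d_size < 8);  // fixed capacity keeps the sketch short
        d_data[d_size++] = value;
    }
};

class IntListIterator {
    const IntList* d_list;
    std::size_t    d_position = 0;

  public:
    explicit IntListIterator(const IntList& list) : d_list(&list) {}

    bool done() const { return d_position == d_list->d_size; }
    int  value() const { return d_list->d_data[d_position]; }
    void next() { ++d_position; }
};

// --- a different (hypothetical) component: public interface only ---

int sum(const IntList& list)
{
    int total = 0;
    for (IntListIterator it(list); !it.done(); it.next()) {
        total += it.value();
    }
    return total;
}
```

Granting `sum` or any class outside the component direct access to `d_data` via `friend` would be the long-distance variety the speaker warns against; here the outside code needs no such access.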

Starting point is 00:31:57 Tell me and I'll try to stamp it out in the next release. It is something that is horrible. You should not do it if you do figure it out. Um, another thing that modules bring to the table that's really good is it gives you control over what we call transitive includes. And an example of where modules are just awesome: you can have a type that's used in the implementation of a type that's public. Let's say it's a data member of the type. An example would be, I have a box. The box has two points as data members. Okay. I don't want to publish point because I've got a better point on the way. So this point is just temporary, right? But I don't want people to use my point, but I want them to use my box. In old fashioned header files, I'd have to make the

Starting point is 00:32:39 point type available and they could create instances on their own. With modules, we can stop that from happening. That means that if you need it, you have to include it yourself. That's a rule. That's a great rule, because if you don't do that, somebody can change their implementation and client code can stop working for no apparent reason. So the reason we want to do this is we want to say, if you want to use point, you have to import it yourself, and if we don't want you to import it, you won't be importing it. Done. And that's a great thing. That gives you the encapsulation you need, the control over encapsulation. What modules don't give you, because modules are obviously the best thing since sliced bread, what modules do not give you is what's called insulation. In the first book, we talk about insulation: I can insulate

Starting point is 00:33:27 an implementation detail, and when I change it in my library, clients of the library don't even have to recompile. Whereas if it's encapsulated, they might have to recompile, but they don't have to rework their code. So modules do wonderful things for encapsulation. They do nothing at all, period, for insulation, and if you want that to happen, it isn't going to happen. Now, we could take modules a step further and we could make them fine-grained views. What that means is, suppose you write a stack class or something, and I'm giving this only as a toy example, and, you know, push is working great, but only the real experts are allowed to use pop

Starting point is 00:34:05 because we've been working on that for a while and we're not sure. You could provide a client with a view that treats the stack class as the same type but is enabled for only push and not pop. You could then give another client full access. You could give yet another client access for pop but not push. Then you could have a client using all three, and they would inherit and take the union of the capabilities of each of the people they import, or the units they import. And so at the very end of the day there'd be a lot of different people accessing the same type. There's no ODR issue, there's nothing like that. And one more thing we'll mention, because we're in the early stages of this: you can imagine that it's not the same thing as hiding the functionality,

Starting point is 00:34:50 because someday you might expose it. So it's treated much more like an inheritance hierarchy and overloading. If you have a deep inheritance hierarchy, which we don't recommend, and you have a private type at the bottom that's a better match than the public type at the top, you'll match the private type and then be told no. And the same kind of thinking would apply to a view on a subsystem. You'd be told, yeah, you'd love to use this, wouldn't you? Yeah, call the company and see if you can get it, because right now we're not going to let you screw that up. We're not going to make it so that when we give you a different view, your code changes behavior, because somebody could reasonably argue that that's dangerous. So we won't let that happen.

Starting point is 00:35:28 Well, this topic leads me to think about the current ongoing discussion. And I know you're getting ready to head to the standards meeting, so if you don't want to talk about this too much right now, that's fine. But breaking ABI: do you have thoughts on that? You're at Bloomberg, you're at a giant organization with lots of code. You just talked about the fact that you open source a lot of stuff, and you can't make decisions lightly in your design. But if you need to break ABI in your code, is that a good thing? Are you good at recompiling? What direction do you think the standard should take on this as this conversation is evolving? Yeah, I think that's a really good question. I think that

Starting point is 00:36:16 what we've done is we've become such a slave to it that it can be really crippling. I mean, you can imagine the other side of the coin, where we changed it all the time and people could also be very frustrated. My own personal experience is that we have old platforms like Sun, and because of Sun, it's kind of like we can't do anything fun, because we have to stick with stuff from really like 1998 C++. We have to stick with really, really old features. And in a way, it's like we don't have any options. We can't use PMR.

Starting point is 00:36:53 So we have to stick with our own version of allocators that were PMR 20 years before we had PMR. And we can't get off of that. So I don't know. I mean, there are two sides of the coin, there really are. What I would like to see, honestly, once we perfect our allocator technology so that it's language-based, and if we can sell that to the standard, I would like to see the issue of std2 revisited. You know what I'm saying? A new std that has all of the mistakes of the last 30, because I'm looking ahead, 30 years, removed. And then you say, okay, I'm either on std or std2. And that is a fork, but it's a clean one-time fork, because I won't be alive for std3.

Starting point is 00:37:48 So it's a clean one time. And the advantage of doing it after 20 years just because of ranges was probably not enough. But if you have ranges, modules, and no more allocators in the code, yet fully handled, then you have a new way of doing business and you can get rid of all of the warts that came from the way we over-specified certain vector things and whatever. I mean, having certain operations have the strong guarantee just turns people on their heads for no really good reason. People were a little premature to promise more than they needed to. And so, you know, when you have node-based containers that have embedded sentinels, and the whole issue of

Starting point is 00:38:30 radioactive types after moves. Oh, moves are another thing. Let me just say this just to annoy people: move isn't all that. If you don't use allocators, you'll get memory diffusion and your code will slow down over time. And if you're not careful with moves, you'll get memory diffusion and your code will slow down over time. So when you're writing libraries, be really careful. If you're copying stuff, move is an optimization. And if it's reasonable to do it, you do it. And if it's not reasonable to do it because you're going across a memory boundary, then

Starting point is 00:39:02 you'll copy and you'll be better off for it on the other end. If you're not copying right now and you say, but I can make it move, stop. Think of what you're doing. You're hurting yourself. You're shooting yourself in the foot. You're making your code incredibly hard to test for absolutely no reason. We have unique pointer, and that is a separate piece of functionality. Now, I have talked to many, many people on the standards committee and asked them to tell me I'm wrong. They don't tell me I'm wrong. So, move-only types: I'm going on the record right now in saying there are about 11 of them. The first and foremost is unique pointer. There are a handful of descriptor-like things that have active destructors: file handles, sockets, shared memory handles, thread handles, and that's it. They're all descriptors in an operating system, and you don't need anything else, because as soon

Starting point is 00:39:56 as you're allocating memory, just use unique pointer and call it a day. Otherwise you're going to pay a heck of a lot for something you don't want to use and you don't need to use. And now, magic trick, right? In C++03, you could use an undefined copy constructor and return by value, in place, something that isn't even copyable or movable, as a factory function. In C++11, you can do that same trick with move. In C++17, you don't need to use that trick anymore, as long as you return the type as a prvalue, which is just composing it and returning it. I'm going to propose in 23 that we do a little named RVO so that we can create an object, beat on it, return it, and now factory functions don't need copy or move at all and they don't need to cheat. By the way, the cheat that I told you is safe because it'll either work

Starting point is 00:40:50 or it won't link. I'm not doing something dangerous, right? Okay, just to make sure everybody knows. So the cat's out of the bag. Move-only types, uh-uh. Somebody tell me I'm wrong and give me a counterexample. I want it. I'm asking. Seriously. I'm sure some listener will take you up on that. Maybe. Bring it. I mean that in the nicest, most cordial possible way because if I'm wrong, I want to know it. I've asked

Starting point is 00:41:15 everybody I can. How should they deliver their example to you? Let me ask you. That's a good question. How about, I don't know if it's appropriate, but I will tell you this, if you're that determined, go to the standard website and look up any of my papers and you'll find my email on one of the papers. How's that? Okay. That sounds good. And I think many of our listeners would already know where to find that. If you're that good that you know where to find it, then I want to hear from you.
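The C++17 factory-function point made a few minutes earlier can be illustrated with a minimal sketch. The `Registry` type and `makeRegistry` function are hypothetical names invented for this illustration, not anything from the episode or the book. The sketch assumes a C++17 compiler, where returning a prvalue initializes the caller's object directly and needs neither a copy nor a move constructor.

```cpp
#include <cassert>
#include <mutex>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical type for illustration: neither copyable nor movable.
class Registry {
  public:
    explicit Registry(std::string name) : d_name(std::move(name))
    {
        assert(!d_name.empty());  // defensive check on a narrow contract
    }

    // Deleting the copy operations also suppresses the implicit move operations.
    Registry(const Registry&) = delete;
    Registry& operator=(const Registry&) = delete;

    const std::string& name() const { return d_name; }

  private:
    std::string d_name;
    std::mutex  d_mutex;  // a member that itself can be neither copied nor moved
};

static_assert(!std::is_copy_constructible_v<Registry>);
static_assert(!std::is_move_constructible_v<Registry>);

// Since C++17, a returned prvalue is constructed directly in the caller's
// storage (guaranteed copy elision), so this compiles even though Registry
// has no usable copy or move constructor.
Registry makeRegistry(const std::string& suffix)
{
    return Registry("registry-" + suffix);
}
```

A caller can write `Registry r = makeRegistry("a");` and the object is built in place. Building the object in a named local variable, modifying it, and then returning it would still require an accessible copy or move constructor today, which is the gap the named-RVO proposal mentioned above is meant to close.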

Starting point is 00:41:41 Okay. You know, we've been talking about how you've been working on this book over the past 20 years. We just spent some time talking about modules and move semantics. How has the evolution of C++ affected what's gone into this book, you know, with C++11, 14,

Starting point is 00:42:00 17. Sure. So again, this book shaped C++11, 14 and 17, not the other way around. I'm not kidding. I've been in the standards committee for a long time, and I've been feeding this stuff there. This book didn't change. Modules changed to reflect components. Development has not changed at all when it comes to designing the components, packages, and package groups that make up an enormous system. Now, move semantics is important. Allocators are important. All of these things are very important in volume two, which will not... Volume one was written to the C++98 subset of C++20. That's another way of saying it was written in 98.

Starting point is 00:42:46 But there are tons of footnotes that address language features that could be used if you have a higher level. But you don't need any of them to design at this level because we're talking about macro pieces. It wouldn't even matter what language you were in hardly, let alone what version of C++. The reason I chose what I did is I want architecture to be approachable by anybody who knows C++. And you're not going to design differently at that level. You're just not. You're not going to have different dependencies. So that's it. I will say, however, that the evolution of my knowledge of generic programming and callbacks... Whereas I talked about levelization techniques in chapter 5 of Lakos 96, levelization techniques is section 3.5 of Lakos 20, and in the callback section there are now five subsections

Starting point is 00:43:42 which talk about different kinds of callbacks data data callbacks, function callbacks, functor callbacks, virtual callbacks, and generic callbacks. And generic callbacks are crazy powerful things. And please keep in mind that you can use virtual functions and get polymorphism that way. You can use templates and get polymorphism that way. They are similar and they are different. And the book explains how they're similar and different. And the book says it in much the same way that Matt Austern did back in 2000. But Matt Austern, of course, was much more avant-garde than

Starting point is 00:44:16 I ever was about templates. So now I understand. So the book will make it clear at that level as well. So did I answer your question? No change, no change at all for volume one, volume two. Yeah, we'll talk the same underlying concepts because it's software engineering. The other thing I want to say is this book is not a book on C++, it's a book on software engineering. And that is probably the most important difference between most books. If you get a book on C++, it'll teach you about C++. If you get a book from me, it's about software engineering using C++, and it's a very, very different beast. It comes from a very different place, and its longevity is much,

Starting point is 00:44:57 much greater. Lakos 96 is still correct. So then would you say this book is relevant to and has concepts that are meaningful to people who don't use C++ and are working on large-scale projects? Absolutely. Absolutely. 100% translatable to Java. Any programming language that has separate compilation units, this is the right book. You just won't be able to take full advantage of the nuanced control that you get from C++. The only missing piece of nuanced control right now is having memory allocation be part of the per-object interface. That's the last, the holy grail, the last thing that separates C++ from the hardware: you might need to get between C++ and the hardware for placing objects in memory. That's the last place. And this will take that away. This will remove it. And now C++ will be

Starting point is 00:45:51 lacquered to the hardware. So I'm curious, you've talked a lot about this proposal that you said will remove allocators from the interface, but still make them available. What does it actually look like? What would the users... So may I make a reference? If you want, go to Alisdair Meredith and Pablo Halpern, CppCon 2019, "Getting Allocators Out of Our Way." Okay. There's an entire hour that talks about what we're talking about. So again, CppCon 2019, Meredith and Halpern, or Alisdair and Pablo, I don't know which it is, "Getting Allocators Out of Our Way." It's as simple as this. When I go to take a simple type that is an allocating type, I have to add an extra constructor,

Starting point is 00:46:42 an extra argument to each constructor, to make sure that I can add an optional allocator. That allocator is stored in the object, a pointer to the allocator, and then used to allocate memory. Mostly, right now, people don't use it, and people might say, well, that's extra overhead and whatever. Please look at any of my 2019 talks. That's not true. It isn't extra overhead; it's not measurable. Please look at my 2017 talks, particularly my Meeting C++ 2017 talk on allocators, to see just how powerful they are. We're not talking a little bit of power. We're talking, in certain build-up and tear-down situations, just how fast the allocator works when it's not encumbered by

Starting point is 00:47:26 synchronization can give you a factor of four out of the box. We're talking about long-lived programs. The actual allocation algorithm doesn't matter. It's the locality and memory that's devastating. And if you don't use local allocators, you diffuse memory across all of virtual memory. And then what happens is you get a low density of your working set per page. And then all of your caches and your physical memory are not doing what the machine was designed to do. And so I really want to emphasize that allocators give you and allow you to get and keep performance. And they're super-duper important for that. But they have many other benefits.
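As a rough illustration of the "local allocator" idea described above, here is a minimal sketch using the standard C++17 `std::pmr` facilities rather than Bloomberg's own library. The function names `totalLength` and `totalLengthWithLocalArena` are made up for this example, and the 16 KiB buffer size is an arbitrary assumption. It shows the mechanism only and makes no performance claim.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <string>
#include <vector>

// Builds a short-lived list of strings whose memory all comes from one
// caller-supplied resource, then reports the total character count.
std::size_t totalLength(std::pmr::memory_resource* resource, int count)
{
    assert(resource != nullptr);
    std::pmr::vector<std::pmr::string> words(resource);
    for (int i = 0; i < count; ++i) {
        // The vector hands its allocator to each string it constructs, so
        // the characters land in the same resource as the vector's array.
        words.emplace_back("a string long enough to need a heap allocation");
    }
    std::size_t total = 0;
    for (const auto& word : words) {
        total += word.size();
    }
    return total;
}

// Same work, but every allocation is carved out of one contiguous buffer on
// the stack. Nothing is freed piecemeal; the arena is released all at once.
// If the buffer runs out, the resource falls back to its upstream resource.
std::size_t totalLengthWithLocalArena(int count)
{
    std::byte buffer[16 * 1024];
    std::pmr::monotonic_buffer_resource arena(buffer, sizeof buffer);
    return totalLength(&arena, count);
}
```

The client code in `totalLength` is identical whichever resource is passed in; only the placement of the memory changes, which is the locality argument being made in the interview.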

Starting point is 00:48:09 And when you start to weigh cost benefit, the only real cost is not to the user of allocator aware software. The only cost is to the software infrastructure group that implements it for them. And that cost is real. However, if it were done by the compiler so all you did was write the class and it were very much like virtual functions where the compiler figures out you know how to manage the the allocator and when to use and all of that and all you have to tell the object when it's constructed outside of the signature you just and i'm making up the syntax

Starting point is 00:48:43 don't hold me to this you just say what you would have said before, using and then an allocator. Just give it the name of the allocator and you're done and nothing's different. And now all the constructor overloads are gone. All the documentation boilerplate is gone. No one needs to know that they're writing allocator aware software, even in infrastructure.

Starting point is 00:49:02 And now it's usable and interoperable with all modern C++: generated constructors, generated assignment operators, aggregate initialization, lambdas, you name it. It's all good. If that happens, there is no cost-benefit analysis. There's nothing to talk about. We're done. I just want to emphasize, go look at that talk. Give me support. Bloomberg is supporting me. Everybody should support me. If I get this done, no one will ever need to see an allocator again. Gone. Okay. I'm going to put the link to this talk in the show notes for anyone who wants to watch it. Please. I strongly encourage it. It's just in its infancy right now. It's nascent.
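For context on the boilerplate being described, here is a sketch of what a hand-written allocator-aware class tends to look like today with standard `std::pmr` types. The `Account` class is a hypothetical example, not code from the book or from the proposal, and the proposed language-level syntax is deliberately not shown because the interview itself says it is not settled.

```cpp
#include <cassert>
#include <memory_resource>
#include <string>
#include <utility>

// Hypothetical allocator-aware value type written by hand, pmr style.
class Account {
  public:
    using allocator_type = std::pmr::polymorphic_allocator<char>;

    // Every constructor carries a trailing, optional allocator argument.
    explicit Account(const char* owner, allocator_type alloc = {})
    : d_owner((assert(owner != nullptr), owner), alloc) {}

    // Plain copy: the new object uses the default resource.
    Account(const Account& other) : d_owner(other.d_owner, allocator_type{}) {}

    // Allocator-extended copy: the new object uses the supplied allocator.
    Account(const Account& other, allocator_type alloc)
    : d_owner(other.d_owner, alloc) {}

    // Plain move: the new object takes over the source's allocator.
    Account(Account&& other) noexcept : d_owner(std::move(other.d_owner)) {}

    // Allocator-extended move: falls back to a copy if allocators differ.
    Account(Account&& other, allocator_type alloc)
    : d_owner(std::move(other.d_owner), alloc) {}

    // Assignment never changes which allocator an object uses.
    Account& operator=(const Account&) = default;
    Account& operator=(Account&&) = default;

    allocator_type get_allocator() const { return d_owner.get_allocator(); }
    const std::pmr::string& owner() const { return d_owner; }

  private:
    std::pmr::string d_owner;
};
```

Five constructors exist here where a non-allocating class would need one, and each added data member has to be threaded through all of them. That per-class cost to the infrastructure author is what a compiler-managed approach would aim to remove.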

Starting point is 00:49:42 I've been told by my management I have to have a working demo of this compiler by the end of this year. Okay. So you mentioned offhanded contracts earlier in the interview. Oh, that's the fun one. What's that? That's the fun one. Yeah. They were added and then removed from C++20. Are you working towards a future direction for standardization of contracts right now? Okay, well, as I said, one of the books, one of the three co-authored books, I don't know if I got to all three. I gave one. No, you didn't. Yeah. Let me tell you the second one. The third one is

Starting point is 00:50:17 actually Rostislav Khlebnikov and I are co-authoring a book on contracts. We're going to be publishing with the same publisher. Target date for that is about two years. But we're also writing a series of papers. The thing about contracts that I think is difficult is of all the engineering that we do, contracts fall into that gray area. I'm going to finish up on contracts, but I just want to mention Joshua Berne and I are writing a book on allocators for the working programmer so that we can show you why they're good without being too pedantic. So that said, let me get back to contracts. So contracts, all of these three co-authored books are in a two-year time frame, just letting you know.

Starting point is 00:51:04 Okay. And we need them because we need people to realize how important they are, and when contracts are available, and I'm pretty sure they will be in 23, we want people to take full advantage. So the thing about contracts is, if you're doing contracts right, if everything is right, you're having absolutely no effect on behavior, which is a really bizarre thing to say. If everything is good, because a contract check is a defensive check. What's a defensive check? What are you defending against? You're stating that this thing that I'm checking cannot happen. Cannot. No way. I put my sign: it can't happen. So the simplest version of this is I write a very elaborate averaging.

Starting point is 00:51:49 I want to get the mid-range of two integers. Now, to do that properly, I have to do some funny stuff. I can't just say A plus B over 2 because I could overflow. So I have to do some funky stuff like A over 2 plus B over 2 plus, what is it, you know, you have to do that long kind of funny expanded thing. You could get that wrong. Now, the right way to do this is using test drivers, but another way to do it is to write an assertion right after that that says as long as the numbers are small, it better be the same as A plus B over 2, right? So you could say if abs A is less than a thousand and abs B is less than a thousand, then it is A plus B

Starting point is 00:52:33 over two. There's no overflow. You could do that. And that would be a redundant check. You don't need to write that if you've tested your program. So anytime you write a check by inspection locally, it's shrink-wrapped. You can't get it wrong, it's redundant, you can turn it off and the logic of the program is the same. The only thing that changes is performance. The next step on that is to say I'm going to have a precondition check. A precondition check is a defensive check. What are you defending against? You're defending against misuse by trusted clients in your shrink-wrapped program.
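The mid-range example can be sketched as follows. This is one possible overflow-free formulation chosen for this illustration, not necessarily the expanded expression the speaker had in mind; the function name `midrange` and the bound of 1000 simply follow the spoken description. The trailing `assert` is the redundant defensive check: it compares the careful computation against the naive formula whenever the inputs are small enough that the naive formula cannot overflow.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>  // INT_MAX / INT_MIN, handy when exercising the edge cases

// Returns (a + b) / 2, truncated toward zero, without overflowing 'int'.
int midrange(int a, int b)
{
    int result;
    if ((a < 0) != (b < 0)) {
        // Opposite signs: the sum itself cannot overflow.
        result = (a + b) / 2;
    } else {
        // Same sign: the difference cannot overflow.
        const int lo   = std::min(a, b);
        const int hi   = std::max(a, b);
        const int half = (hi - lo) / 2;
        // Round toward zero: down for non-negative inputs, up for negative.
        result = (a >= 0) ? lo + half : hi - half;
    }

    // Redundant defensive check. For small inputs the naive formula is safe,
    // so the two answers must agree. Compiled out when NDEBUG is defined.
    if (-1000 < a && a < 1000 && -1000 < b && b < 1000) {
        assert(result == (a + b) / 2);
    }
    return result;
}
```

The range test is written with explicit comparisons instead of `std::abs` because `std::abs(INT_MIN)` is undefined behavior. With the check disabled the function returns exactly the same values, which is the property the interview stresses: in a correct program the assertion changes performance only.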

Starting point is 00:53:06 So when the program is all said and done and compiled and linked, nobody can change the calls. The library is hooked up to the app. It's one program. And then those checks that you have to make sure that the thing wasn't called out of contract are redundant. Now, once you run this program for long enough and none of those checks fire, you say, I'm good. You turn off the checks and the program runs faster and life is good because the program is mature. That is a defensive check. Now let's look at the other side. What is input

Starting point is 00:53:38 validation? Input validation. I have a config file or I have a customer data coming over the wire. Do I trust the customer? Hell no. Customer is a trading system. So I'm not going to defensively check against that the data is valid. I'm going to validate it in every build mode forever because no matter how good my program is, any new customer comes along could kill me. That's input validation. Now you might say, well, wait a minute. What if it's my own config file? I wrote it myself.

Starting point is 00:54:11 It's good. Life is good. Now the question becomes trickier. Who's putting the config file there? Is it shrink-wrapped into a container, or is it, you know, Joe put it in today and somebody released this thing tomorrow and they're inconsistent? So the wisdom is, unless it's shrink-wrapped together by dint of being a compiled program

Starting point is 00:54:32 or you have some kind of module, container, something, package, where you deploy the whole thing at once after it's been tested and it cannot come unglued, then the check is an input validation check, not a defensive check. But if you can absolutely guarantee that it's been shrink-wrapped and tested, then the internal defensive checks can be removed and there's no change in behavior and no risk of its getting out of sync. So that's where contracts are hard. And certain people on the standards committee do not understand that a program with no contracts, a program with some contracts, and a program with all contracts enabled are the same program, assuming that the program is correct.
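The difference between input validation and a defensive check can be made concrete with a small sketch. Both functions below, `parsePort` and `isPrivilegedPort`, are hypothetical examples invented here. The first handles text that arrives from outside the program (a config file or the wire) and validates it in every build mode; the second has a narrow contract for trusted callers inside the same program and guards it with an `assert` that disappears under `NDEBUG`.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Input validation: the text comes from outside the shrink-wrapped program,
// so it is checked in every build mode and failure is a normal result.
std::optional<int> parsePort(std::string_view text)
{
    int value = 0;
    const char* first = text.data();
    const char* last  = first + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) {
        return std::nullopt;  // not a number, or trailing junk
    }
    if (value < 1 || value > 65535) {
        return std::nullopt;  // a number, but not a usable port
    }
    return value;
}

// Defensive check: callers are code in the same program and promise
// 1 <= port <= 65535. The assert only detects a bug in those callers.
bool isPrivilegedPort(int port)
{
    assert(1 <= port && port <= 65535);
    return port < 1024;
}
```

Turning off the `assert` changes nothing for a correct caller, whereas removing the checks in `parsePort` would change the program's behavior for bad input, which is why the two kinds of check are treated differently.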

Starting point is 00:55:16 And if the program isn't correct, then having a contract that checks and continues is no worse and infinitely better than not having a contract at all. There's a misunderstanding of what's going on there and that misunderstanding, because it's not easy to appreciate if you're not a real day-to-day software engineer, that is what derailed contracts. I will fix it. I promise you, I will fix it. In fact, I think everybody is getting to understand that contracts are a different beast and that if you write contracts properly, they have no effect on a correct program and they don't defend against anything with the correct program. They don't defend against bad input with a correct program. They do nothing with a correct program. If your

Starting point is 00:56:01 program is incorrect, it means you didn't test it. You should go back and test it. But once it's all working, leave the contracts effective for a little while, make yourself happy, then turn them off and buy less hardware. Okay. Well, I'm looking forward to contracts. It was one of the things I was looking forward to the most for C++20 because even forget the design or any question about the design, any tool that helps us validate software, I'm good with. And let me just point out that there are three reasons for having contracts, I believe. There's sort of three different camps. One of them was making software safer, I think that was one of the things. And one of them is making software faster,

Starting point is 00:56:47 because once you have assertions in place and you know they're true, not checking them is faster than checking them, but assuming that they're true and letting the compiler optimize based on them is even more. And so runtime checks, awesome. Compile time checks for static analysis, awesome. Speeding up runtime code based on optimization for the compiler, interesting. All three of them together for free because it's a Trojan horse. If you have contracts in place, anybody can do static analysis. Right. All right.

Starting point is 00:57:23 Well, John, I think we're running out of time, so we'll let you go. But thank you so much for coming on today, and I encourage listeners to check out the book. Well, I thank you very much for having me, and I think I managed to touch on everything I was hoping to touch on. And thanks for asking about contracts. We're going to do our best to get them in as soon as possible to 23. And let's not wait till 26. I want everybody who's interested in getting them in by 23 to write your congressman or write your language designer and say, yes, we want contracts in 23. Okay.

Starting point is 00:57:55 Okay. Sounds good. Thanks, John. All right. Take care. Thanks. Bye. Thanks so much for listening in as we chat about C++. We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, we'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com.

Starting point is 00:58:15 We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter. You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank all our patrons who help support the show through Patreon. If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode was provided by podcastthemes.com.
