CppCast - The C++ ABI
Episode Date: November 21, 2019

Rob and Jason are joined by Titus Winters from Google. They first discuss some news of C++ tools, including Sourcetrail going open source and C++ Build Insights for Visual Studio. Then Titus goes into what the C++ ABI is, what breaking the ABI means, and whether or not we should consider breaking the ABI in future versions of C++. Titus also shares a preview of his upcoming book 'Software Engineering at Google.'

News: Pittsburgh C++ Meetup Group; Sourcetrail is now free and open source; Guide to Performance Analysis and Tuning; Introducing C++ Build Insights

Links: ABI Now or Never; Software Engineering at Google: Lessons Learned from Programming Over Time; CppCon 2019: Titus Winters "Maintainability and Refactoring Impact of Higher-Level Design Features"; CppCon 2019: Chandler Carruth, Titus Winters "What is C++"; Hyrum's Law

Sponsors: Backtrace (Software Crash Management for Embedded Devices and IoT Projects; Announcing Visual Studio Extension - Integrated Crash Reporting in 5 Minutes); JetBrains
Transcript
Episode 224 of CppCast with guest Titus Winters, recorded November 21st, 2019.
This episode of CppCast is sponsored by Backtrace, the only cross-platform crash reporting solution that automates the manual effort out of debugging.
Get the context you need to resolve crashes in one interface for Linux, Windows, mobile, and gaming platforms.
Check out their new Visual Studio extension for C++ and claim your free trial at backtrace.io/cppcast.
And by JetBrains, makers of smart IDEs to simplify your challenging tasks and automate the routine ones.
Exclusively for CppCast, JetBrains is offering a 25% discount for a yearly individual license, new or update, on the C++ tool of your choice, CLion, ReSharper C++, or AppCode.
Use the coupon code JETBRAINSFORCPPCAST
during checkout at www.jetbrains.com. In this episode, we discuss some news on C++ tools.
Then we talk to Titus Winters from Google.
Titus talks to us about the C++ ABI
and his new upcoming book on software engineering at Google.
Welcome to episode 224 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how's it going today?
I'm doing all right, Rob. How are you doing?
Doing good. How is CodeDive?
CodeDive is great.
Well, it's a good opportunity to hang out with, well, some of these people,
people that I've seen at five different conferences this year so far.
Right.
So, yeah, it's been, it's pretty crazy.
Yeah.
And is the conference over now?
Yes.
So today was the last day.
It's just a two-day conference. They said, I think I heard that there were 1,500 attendees for a two-day four-track conference,
which seems kind of crazy, but it was really packed.
A movie theater.
We basically took over the entire movie theater.
Oh, okay.
And is CodeDive, it's not a strictly C++ conference, right?
It is not strictly a C++ conference. There's also some talks
at least Python and
JavaScript related.
Although some of them still kind of came
back to C++, the ones that I saw that were
like about
asm.js and why you need to
compile your C++ to it
or whatever.
Cool.
Okay, well, on to our first piece of feedback. Last week, we got an email asking us to kind of give a plug for a new C++ user group that was starting up. That one was the Maryland C++ user group, and we got another similar request. This is from Rob Keelan, writing, "Hey Rob and Jason, I just wanted to announce that I just started a C++ meetup group in Pittsburgh. We had our first meeting in November. Our next meetup will be on December 5th, where Robert Seacord will be discussing security concerns of integer operations." So yeah, if you're in the greater Pittsburgh area, you should check out the C++ user group. It's called CPP Pitt, and the next meeting will be December 5th.
Wow, that's pretty cool.
I'm immediately
curious, what is the security
concerns of integer operations?
Yeah, I don't know
exactly what that talk is going to be
on. Sounds like an interesting topic, though.
Yeah, definitely. Let's see.
December 5th. All right, cool.
Well, we'd love to hear your thoughts about the show as well. You can always reach
out to us on Facebook, Twitter, or email us at feedback@cppcast.com. And don't forget to
leave us a review on iTunes or subscribe on YouTube. Before I introduce today's guest,
I just want to apologize for the audio of this episode. Due to some technical difficulties,
Titus couldn't record in his home or office and was in a common space where his microphone did
pick up the sounds of some other people nearby, including a baby crying. I did my best to clean up the audio but you may
still hear some background noise when Titus is talking. Joining us today is Titus Winters. Titus
is a senior staff software engineer at Google where he has worked since 2010. Today he is the
chair of the subcommittee for the design of the C++ Standard Library. At Google, he is the library lead for Google's C++ code base,
250 million lines of code that will be edited by 12,000 distinct engineers in a month.
For the last nine years, Titus and his teams have been organizing,
maintaining, and evolving the foundational components of Google's C++ code base
using modern automation and tooling.
Along the way, he has started several Google projects
that are believed to be in the top 10 largest refactorings in human history. As a direct result of helping to build out refactoring
tools and automation, Titus has encountered firsthand a unique swath of the shortcuts that
engineers and programmers may take to just get something working. That unique scale and perspective
has informed all of his thinking on the care and feeding of software systems. His most recent
project is the book Software Engineering at Google
to be published by O'Reilly in late 2019, early 2020.
Titus, welcome back to the show.
Thank you for having me.
Thank you for having me.
I really should maybe trim that bio down.
You know, I was just sitting here thinking that
do you really need to work on adding some more things to your bio?
It seems a little slim at the moment.
Yeah, sorry about that.
I'll come back in.
I'll come back in a year and we'll see what we can do about that.
It's really difficult to decide what exactly should be in your bio.
It's like when I'm making a conference submission,
it is the hardest part of the conference submission, right?
It's not the talk.
It's what should the bio be at the moment.
Right. Yes.
And there's not a universal accepted word count for those things.
So anyway, yes.
1,000 words or less.
Yes. Thank you for having me.
All of these things we'll be talking about, I'm sure.
Yeah, we'll definitely be talking about more of some of this soon.
But first, we've got just a couple of news articles to discuss.
And I think all of our news articles today are kind of tooling related,
so it's kind of appropriate that we have you on, Titus.
Cool.
So feel free to comment on any of these.
The first one is SourceTrail, a, I guess, software discovery and exploration tool that we've talked about before on the show, right, Jason? And it is going open source.
Yeah, I had some private discussion with Eberhard about this, and it's pretty exciting news. Yes, okay, so SourceTrail is going open source, but the crux of it is that they're hoping that Patreon will help support the development moving forward. And they're being totally open with it, like how much support they currently have today. Let's see what the current number is... 99. Yeah. So the Patreon's only been open, I think, for like three days. So it's definitely early, but if you're listening and you haven't heard of SourceTrail yet, you should absolutely check it out.
You know, we did do a podcast episode with them, maybe like two years ago, Jason. At least two years ago.
Yeah.
It's definitely a great tool. I've seen lots of people on Twitter who are, I guess, checking it out for the first time, who seem to be enjoying it. So if you're, you know, getting good use out of this tool, you should really consider supporting on Patreon. I mention it to every class of students I have, that, you know, it's something they should at least look into if they need a better way to visualize their code. And I think there's no reason why they shouldn't be able to hit their $1,500-a-month minimum goal, to have people who can at least be paid to maintain the open source project.
Yeah, yeah, absolutely. Like, I think it's so critical.
Like, my SRE colleagues will say you can't run production systems without monitoring.
And as a primarily focused on the code base sort of person,
systems like that that give you analysis and tracking and monitoring,
understanding of the code base, it's a direct comparison.
You can't function in a big project without stuff like that.
Now, they do say in the article that, you know,
if you get like 10 million lines of code or something,
that it has a hard time keeping up.
So it probably doesn't work at Google scale.
Yeah, but most things don't.
Right.
Okay.
Next thing we have is an article on Bartek's coding blog.
And this one is a guest post from Dennis Bakvalov. And it's all
about performance analysis and getting started if you are interested in starting to profile your
code. It doesn't go over any specific tools, but kind of goes in some general terms about how
profiling works, what you need to do. And if you're interested in what this post is all about,
Dennis has a his own blog where he talks about some of the stuff in a lot more detail often.
So this kind of brings up a question for me, because at the conference this week,
there's been lots of discussion about how much cache can affect performance
and how much micro benchmarks
are actually meaningful in your project.
And this kind of makes me think,
since we have Titus here
and you're talking like, you know,
billions of lines of C++ code or whatever,
at what point, like,
can you perform benchmarks
on Google's scale code base?
And is it meaningful?
And can you deal with this?
Yes.
There is as much art as science in all of that.
Okay.
And yeah, like, cache issues absolutely dominate.
There's a lot that you have to learn
on the difference in like relevance
between micro benchmarks and,
and sort of more macro benchmarking systems.
Like, we absolutely suggest that you start with micro benchmarks. But at the same time, it's probably once a month that I get, you know, email questions internally on things like,
hey, my benchmark doesn't seem to make any sense.
And it's just purely because they didn't use the result of a function call.
And the optimizer was like, oh, disappear in a puff of optimization.
Oh, that's awesome.
So there's stuff starting with that and then cache issues.
So you have to come up with clever ways to design it to be a cold cache benchmark or a hot cache benchmark.
But then micro benchmarks themselves don't really tell you the aggregate information that you actually care about.
For something like Google, it's usually going to be like, how many queries per second can we actually handle?
So it's a signal,
like a good micro benchmark is a signal as far as performance goes.
But in the end, it's going to be
much like the distinction between unit tests
and integration tests and canary deployment,
micro benchmarks, macro benchmarks,
and then production monitoring.
You need all of the above.
So one of the talks I was in today was from Björn Fahller, and it was about the impact of cache on your C++ code. And one of the comments that he made was, you know, he agreed that micro benchmarks are not a great indicator of the overall system impact. But if you can show fewer cache misses in your micro benchmark, that's going to help the overall system. And I was just curious if you have any comment on that.
Oh, absolutely. I mean, I think that's effectively the major theme behind the Abseil hash tables designs.
It tries to minimize branches, sure, but primarily it tries to minimize cache misses.
And that's responsible for our current estimate of: we're 30 or 40 times better than std::unordered_map.
That's ridiculous.
Okay. But yeah, cache is everything. And this is also why, like, I have to teach new programmers constantly: if you're using std::map or std::set, that is O(log n) cache misses. That's a lot. Cache misses comparable to just locking and unlocking a mutex. Please, please don't do that.
Alright.
And then this last article
we have is from the Visual C++
blog, and it's about
a new tool that they're coming out with.
It'll be available in
version 16.4. You can get it in the
preview now if you're using preview builds.
And this one they're calling C++ Build Insights.
And, you know, we always complain about build times in the C++ world.
So they're putting out this tool where you can just run this little vcperf command before and after your build and it will collect some information and then be able
to visualize what parts of your build are taking the most time and hopefully help you then figure
out how you can reduce that time and improve your build speeds so it looks like a pretty
powerful tool i'm definitely going to check it out when 16.4 comes out. Now, Rob, could you tell in this article, it was not obvious
to me, if
I'm using 16.4, but
for business reasons I need to use the
Visual Studio 14 compiler,
can it still give me this output?
I am not
sure if they mentioned that.
Hmm.
Yeah, I don't know. I mean, it says you need to use
the VS 2019 command prompt to start and stop the performance analysis session. So I'm hoping that means you just need to have that command prompt and you don't need to be using a specific version of the compiler, but I'm not sure.
Right. Because they've been very good recently about allowing you to use older compilers with the newer Visual Studio, right?
Right, right.
That's a good question.
We might need to maybe ping someone on the Visual C++ team, see if they can answer that.
Sy Brand, I think, would be a good bet.
Yeah.
Okay.
And I'm assuming that Google doesn't use Visual Studio.
No, we don't.
It's not our primary. Well, at least it is not our primary, no.
Right.
Okay. Well, Titus, I think the main thing we want to have you talk about today is the ABI, and I was wondering if we could start off by having you define what exactly the ABI is for listeners who maybe aren't too familiar with it?
Yeah, sure. So I think in past talks, I think I discussed this in a talk I gave for Pacific++ in 2018, I described ABI as sort of like a network protocol. It's the way that we're going to sort of define the binary structure of how we communicate
between, in this case, not a client and a server, but the caller of a function and the callee.
And so this is everything from how do you actually put in memory and registers the parameters that are going into the function,
but also even just how you interpret the memory that those parameters are comprised of.
So, for example, we can look at std::string: in most implementations it's going to be something like three words. And it's probably, say, a pointer to the allocation, and then two words for how much of that allocation is live, the size of the string, and the reserved size of the allocation, right? So capacity. But there's nothing in the standard that says what order those fields are in. And there's nothing that even requires that that's a pointer and two integers. It could be two pointers and an integer or any combination of those things. And so this comes up primarily when we're
talking about building things at separate points in time.
So shared libraries and DLLs, you encounter this a lot.
And if the layout of your objects changes, or even just the interpretation of the bits that make up your object changes,
then when you build that in one translation unit and one in your main,
and then you try to read those bits on the other side of the gap in your SO or your DLL,
then obviously that's not going to work in exactly the same way that it wouldn't work
if you were transmitting a network protocol that was talking in binary,
and you changed the layout or changed
the meaning of things. Or if you changed the format that you were using for zip compression
or JPEG, the sender and the receiver have to agree on both the number of bits and the meaning
of all of those bits. And when they don't, then we have a problem.
The reason that this is all tricky is because we've, as a community,
sort of been moving towards greater and greater binary compatibility.
So old SOs, old DLLs, old builds of things are compatible uh over time and that means that whatever the standard
library implementers back you know 10 years ago were doing or maybe five years ago depending on
your platform whatever they chose to do back then is now baked into the system forever and
it's kind of holding us back. And that's a hard question.
Right. So since we're talking about, you know, history a little bit right now, my understanding is the ABI did break when C++11 came out. Is that right?
So this is sort of platform-dependent. Okay, there isn't one defined ABI; it is really per-platform. And the standard doesn't actually
define ABI itself. It is a thing
that is entirely up to the implementation.
Does the standard
even mention the ABI?
It doesn't, which makes all of this
extra complicated.
In 2011,
it was well known to the committee
that the GCC
implementation of std string was copy on write.
And for various reasons, the committee decided that it was important to actually forbid the copy on write implementation.
And that was understood that that was going to be an ABI break for GCC and for libstdc++ especially,
because anything that was compiled with that copy-on-write string
wouldn't work with a small string optimization string implementation instead.
Even if it's the same number of bits, it's a different meaning of those bits.
And so starting in, you know, 2011, 2012, when they introduced the option to change to a standards-compatible std::string, there's a, you know, silent, difficult fork in the community between which binary representation you are using for that. And it's a thing that the maintainers for libstdc++ complain about to this day. It is very costly because people rightly don't entirely understand the places that they might be depending on a binary representation of something as common as std::string.
So I think something I've kind of seen recently, recently enough to surprise me.
When you go and build your C++ project
on some version of Linux with GCC,
you might see an error that says like,
oh, GLIBCXX 3-point-something, whatever, not found.
And it's trying to link to the standard library at some specific version, because some library that you just included was expecting a different version of the C++ ABI. Does that sound about right?
That does not surprise me. Yeah. Okay. And so part of the reason that this is difficult is that ABI
is very natural for a language like C, right? Like we don't care about the representation or meaning of an int
or a struct of two ints or any of the types that are available in C.
And so we have the same tooling ecosystem.
We distribute the same SOs.
We use the same linkers as the C ecosystem does.
And yet, although everything in the ABI for C is basically fine and
pretty reasonably assumed to be stable forever, in C++ it's weird to even think that it might be.
And we don't have any way to broadcast that to people. And so everyone winds up like sort of silently depending on it.
And it's really, really holding us back right now.
Right.
So that kind of brings us to your paper that you wrote before the Belfast C++ meeting, which is ABI Now or Never.
We did discuss your paper in the news a couple episodes ago when we had JeanHeyd Meneide, but could you maybe tell us a little bit about
what brought you to write that paper and what the response was like in Belfast?
Yeah, so what brought me to write the paper, I've been
squealing to the committee for a little bit.
Sorry, squealing. Yeah, I mean, really, that's
kind of it.
But specifically on the topic of, are we going to break ABI in 23?
Because if we are, we should figure out as much value as we can to pile it all in,
because I don't know whether we'll be willing to do it again anytime soon.
And if we aren't, we should also maybe just admit that we're probably not ever going to do it. Because at least
under Linux platforms
with C++ and
Libstd C++, we've effectively
been ABI compatible since
2011. And
if we're not doing that in 23, then
the next chance is 26.
That'll be 15 years of
people having the ability to silently depend on this.
Like Hyrum's Law dominates, right?
Like everything actually has hidden assumptions on the fact that ABI is stable.
And if we actually try to change it at this point, like it might break the ecosystem completely but if we don't
it's also really hard to say that the standard library cares about performance, because it doesn't. I mentioned earlier, like, near as I can tell, the API-compatible, source-compatible node_hash_map in Abseil is, in many microbenchmarks, 30 or 40 times more efficient than an unordered_map.
Wow.
And a reasonable regular expression implementation is probably hundreds of times more efficient than std::regex.
I like your characterization of a reasonable implementation.
Yeah. And clearly, let's not get started on, like, Hana's stuff. That doesn't even compare.
Yeah, right.
You know, and so, like, we can't claim to be the systems language that puts performance above everything else if we're so concerned with backwards compatibility that we're willing to leave orders of magnitude of performance in kind of important data types on the floor. And that is the fundamental, deep question for the community and the committee right now.
So I would like to say something I meant to say a minute ago,
but missed the opportunity,
is that I really like your explanation
that the ABI is like a network protocol.
I've never heard that explanation before,
and I think it's very, very fitting.
But also, I'm sorry, go ahead.
It's also noteworthy that if you look at any grown-up network protocols, they have a version number in them.
Right.
And the ABI doesn't, which makes this extra hard.
Well, if it's not in the standard.
Right, but none of the major platform ABIs have a version number in them. And as we all know, like if you want to upgrade from a thing that has no version number to a
version number,
even that is a break.
So fun times.
So for my personal experience,
I read Herb and Andrei's C++ Coding Standards book, where they effectively
say,
don't put C++ objects across ABI boundaries.
To which I said, that's just silly, I'll just recompile all the things. And then I've also, like, early in my career, would download various versions of Boost. And downloading Boost on Visual Studio specifically is, like, hyper-specific to which release of Visual Studio that you're using if you want the binary.
So for most of my career, I basically just assumed I can't trust a C++ binary. I'll just build the things that I need for my particular target. So what I would like to ask you is,
what kinds of projects does this really affect?
Who can't recompile their projects?
People that have plug-in interfaces,
people that are relying on pre-compiled,
vendored code, things like that.
In our experience,
Google famously tries to build everything from source,
and we fight real hard, even with external clients and distributors and things to get source for everything.
And even with that, when we recently had to make ABI changes in the standard library,
and even for that, it probably cost us at least five engineer years.
And we're usually pretty good at making changes and stuff.
So where did that time actually go?
That went to tracking down all the places that those dependencies had crept in
and trying to push back on those teams and mitigate things.
It went to coming up with mitigation strategies
where we, in some cases,
may still have the old symbol available in a binary
so that things continue to run.
It isn't a solution that will last forever,
but if you change the mangled name,
which is the name that your linker is relying on,
if you change the mangled name of a symbol, then you can have one type,
and you get the new version whenever you're compiling code,
but the old binary representation may still coexist in your DLL or your SO.
Sounds tricky.
Yeah.
We had to do a lot of tricky stuff.
And we have almost no dependence on ABI.
So I pretty much assume that Linux distributions and Adobe plugins and everything.
A surprising number of things are actually
shipping binaries or unable to recompile things.
Because it's been 10 years
since anything changed on those platforms. It's not as bad
under Windows,
but in response to a lot of users complaining that they had to rebuild for every new Visual Studio release,
Visual Studio has been better about not breaking ABI
every release like they used to.
And now they're sort of starting to see that they can't break it as easily as they used to, either.
Yeah, it's almost like an irony that they've been better at doing something that perhaps if they had been worse at it, it would be better.
Yes, because it is very easy for a user to want to not have to recompile code. Because it is sort of a tragedy of the commons, right? Like, if we had the ability to recompile things and, like, change the layout for unordered_map, change std::hash... if we had that, right, we could also give everybody, you know, some significant performance boost for those use cases, right? But that's sort of diffuse and in the future,
whereas it's very easy to complain about,
ah, I want this to be compatible.
Why doesn't this work right now?
I want to interrupt the discussion for just a moment
to bring you a word from our sponsors.
Backtrace is the only cross-platform crash and exception reporting solution
that automates all the manual work needed to capture,
symbolicate, dedupe, classify, prioritize,
and investigate crashes in one interface. Backtrace customers reduced engineering team time spent on figuring out what
crashed, why, and whether it even matters by half or more. At the time of error, Backtrace jumps
into action, capturing detailed dumps of app environmental state. It then analyzes process
memory and executable code to classify errors and highlight important signals such as heap corruption, malware, and much more. Whether you work on Linux, Windows, mobile, or gaming
platforms, Backtrace can take pain out of crash handling. Check out their new Visual Studio
extension for C++ developers. Companies like Fastly, Amazon, and Comcast use Backtrace to
improve software stability. It's free to try, minutes to set up, with no commitment necessary.
Check them out at backtrace.io/cppcast.
So, going back to your paper, what was the response from the rest of the committee when you brought it to Belfast?
So, full disclosure, I sent the paper to the reflectors and had quite a bit of discussion. Like, people are interested and engaged. I know the discussion is coming, but Herb and I agreed that the primary goal for the Belfast meeting was resolving the NB comments on the C++20 draft. The ABI is a '23 problem.
That makes sense.
So we delayed it. We're hoping that it will happen in Prague,
but I'm also hearing that there's potentially questions about what rooms are available and things like that.
So it will come up sometime in 2020, hopefully in Prague, but maybe not until the summer.
And Prague is February?
Yes, Prague is February.
And the next meeting after that is Varna.
Varna, Bulgaria.
Yes.
Okay.
Okay, but there was a new study group formed at Belfast, right?
There's an ABI study group now?
So it's not so much an ABI study group as, I think they're calling it, a resource group.
Okay.
Which is because it is very common for people to not understand the subtleties and nuance of, like, is a given proposal going to break ABI? The ARG, the ABI Resource Group, is, I guess, present primarily to answer questions of: if we made this change,
is this an ABI break on any major platform? Okay, because right now, you'll only know that if,
you know, a compiler implementer is sitting in the room and notices it.
Yeah, sort of. Like, some of us, like, I've gotten better at spotting the things that I think
are likely to be an ABI break, but there are still some times that I cannot do that work in my, I can't do that math in my head.
So having some people that are, like...
I mean, what is the probability that something gets approved for the standard, and it takes six months or a year or more for someone to realize this is an ABI break, and now we have to reconsider this thing that was already approved?
So far in my experience, I don't think that's happened. We've had a couple cases where things made it maybe all the way to plenary,
and then someone pointed out, oh, on my platform that would be an ABI break.
One that I'm reminded of.
We added a variadic lock guard in C++ 17.
Yes, right.
And that couldn't be done with the
I forget the name of which one's which.
It couldn't be done by making
the existing single template
argument lock.
It couldn't be done by changing that
to be variadic, because on
some platforms, the mangled
name of a class template of a single argument
is different than the mangled name of a class template
of variadic arguments where there was only one provided,
which was subtle.
There was a whole argument over, like, does this even matter?
Is this a class that could possibly be passed through ABI boundaries?
Like, is anyone ever going to notice this? But the implementers sort of rallied, and like, no, no, even if it's only an ABI break in theory, we're not going down this path.
Yeah. So for the sake of our listeners, it was, I believe, lock_guard, if I'm looking at this correctly, that was added in C++17, that is the variadic version.
And before that, we had unique lock, scoped lock, and the other ones.
Yes, it was unique lock.
Yeah, and actually that example sort of illustrates some of the committee issue of it all,
of the term ABI does not appear anywhere and the standard itself does
not govern the ABI of anything. But we've been operating under a pretty solid regime of:
if it's an ABI break, implementers basically have a complete veto over going down that path.
Wow. Um, and I mean, it's certainly within their rights to decide how their product is going to work.
But it does put us in this awkward position of we claim to be a language of performance above all else.
And standard library is maybe not living up to that right now.
Which could be fine.
Like I describe in the paper, I don't think it's
immediately the end of the world for C++
to admit that for the standard
library, stability is more important.
But it does mean pretty explicitly
that if that's the case,
you shouldn't be using the standard library if performance
matters. You shouldn't be using
standard library simple types
like vector and type
traits and things like that.
And then if you actually need performance, then you go to Boost or Folly or Abseil, depending on what level of source compatibility you actually need.
So the one aspect of this that I totally agreed with on your paper, or at least maybe if I was just reading between the lines, is that the committee has to decide that ABI is something that they are dealing with,
one way or the other, right?
So, you said you've only gotten comments on the reflector so far,
but do you think that the committee will actually,
like, at some point, the standard will actually mention that the ABI is a thing?
I don't think so, and I don't think that's what's really on the table.
I think it is purely a question of deciding whether or not to honor that implementer veto.
Okay.
Because if the committee votes overwhelmingly that we're going to change these things that break ABI,
it only really is going to work if all of the implementers are on board.
Right.
And if we decide that we're just like,
we're going to say that we break ABI and then no one ships it,
that doesn't help anybody.
And claiming that we're about performance and then actually just honoring the implementer veto on ABI issues
is a little false to the community, right? And so regardless of whether we put the word ABI
in the standard, we really do have to have the discussion and come to consensus, both amongst,
you know, interested committee members, but also amongst implementers
of what are we doing?
Because neither option is any good.
It is a
significant shift in
understanding of
what the goals of the standard library are
to admit that performance
is not actually our
priority. But it would also
be the biggest disruption to the C++ ecosystem ever,
if we decide to do this properly.
So that's fun.
I think, yeah, I mean, I've always personally assumed that the zero-overhead principle applied to built-in language features, not necessarily to standard library features.
But I personally have no problem with the ABI breaking with every release, honestly.
Right.
Because the projects I've worked on, that's totally a viable option: to be like, well, you have to recompile.
Right. And I think, like I say in a lot of
talks, and it's a major theme of the book, that time is a very hard thing for us to deal with in
software, and ABI is effectively us ignoring that time is going to really hurt us as far as performance goes.
We can't do this forever.
We can do this for a while.
I don't know how long.
I'm honestly a little surprised that we haven't encountered a Spectre-style
speculative execution bug that required some change to
how we actually invoke a function safely.
And if you have to do that, that's a huge ABI break.
So all the current Spectre mitigations apply at the call site, huh? Am I... is that wrong?
No, I'm hesitating because I don't know enough to answer that question.
Oh, okay. That's fine. Yeah.
Since you just mentioned the book,
do you want to tell us a little bit more about that?
Yeah. So it's coming out through O'Reilly sometime soon-ish.
I don't have an exact date of when it will hit the shelves.
But yeah, for the last year and a half, a little bit more,
I've been the lead for an effort to put together a book on software engineering as Google understands it.
And this is very much in line with the previous big book from Google,
the Site Reliability Engineering book,
where I tried to explain, like,
okay, how do you have reliable production systems?
And how do you, like, keep
things running smoothly and scale up to, you know, managing data centers and all this? And my book
is mostly about, like, how do you manage a code base of, you know, millions, hundreds of millions
of lines of code and, like, keep that working for tens of thousands of engineers smoothly for decades.
And obviously that is not going to be the problem that most people have.
Right.
But I keep describing it as this is sort of a scouting report, right,
of these are the things that were necessary in order to get to this scale.
And a lot of them, I think, are perfectly fine,
like reasonable ways to run a smaller organization.
So if you want to, you know, have a smooth path that will scale up,
maybe try this.
And then a lot of it is just like insights into like
how to think about a particular problem.
So questions of version control policy
and dependency management
and how to manage a build system
and also cultural things
like how to be a good teammate
and how to do code reviews like a grown-up.
And so it really sort of covers the whole gamut
through the primary lenses of
how is this going to work over time how is this going to work as your organization scales up and
what are you trading off when you make decisions in this space.
So, you said "primary". Are there co-authors?
So the book is something like 25 chapters.
There are three major leads on it.
Myself, my tech writer, Tom Manshreck, and Hyrum Wright, who I think you've seen at conferences and the like.
Yes. Sounds familiar.
Yeah. He talks a lot about refactoring things. Tom and Hyrum and I started this, but because there's 25 or 30 chapters going,
and we're not the experts on all of these things,
we recruited experts from teams all across Google.
I think there's something like 15 or 20 primary authors.
We've been deeply involved in all of the chapters
to make sure that it
ties in with the major themes
and that the chapters
reference one another and talk about
the emerging themes
across the piece as a whole
but yeah
it was
the number of pages that have my name on it
is perhaps relatively small
and the number of pages that have my fingerprints on them is most of the pages.
So is this a topic that O'Reilly approached you about, or did you go to O'Reilly?
We went to O'Reilly, sort of following with the contacts that we had made on the SRE book.
But that was no effort.
Right. I'm sure it wasn't.
Yes, that sounds very good.
So, yeah, it has been a fairly nice collaboration so far.
And it's currently available for pre-order, right?
Yes, it is currently available for pre-order.
They have, I don't know, something like half of the chapters are through most of their production process.
We're still doing last copy edits on a couple things.
We are hours away from Google being completely done.
And then I don't know what all they wind up doing.
And then at some point they have to find time on, as I understand it, some very, very big printers.
And then there will be a book.
Do you have any advice for anyone considering working on a book?
Don't.
So, especially in tech, the thing that I should have understood but didn't, that seems so obvious in retrospect, is that in tech it's a lot more nebulous, right? Like,
you can know what you want to say, but the questions of did I write enough, did I write too much, is it
clear enough, is it punchy enough, does the message get through? There's no unit test for that, and I had a lot of anxiety over that.
Yeah. Wow.
Because it is a lot harder to figure out when you are done, and figure out, like, okay,
I feel good about this, but am I totally delusional? So it's only been really in the last handful of
months, as more people have gotten their hands on drafts of things, that I've been starting to breathe a little easier.
But yeah, it is a lot of work.
Wow.
Okay.
We talked last week all about other Belfast news.
Is there anything else you wanted to share?
Any highlights out of the most recent committee meeting?
Not a lot, I think.
And it was mostly addressing the quote-unquote bug reports?
Yeah, exactly. It was a week-long process
of bug report and code review, effectively.
Not a whole lot of new features to talk about. Not a whole lot of interesting
development. At least not for rooms like
my room. There was certainly some
interesting stuff, as
I understand it, in SG1
and in EWG.
We did make some small
progress on executors
in Library Evolution,
mostly in terms
of listening to
another SG1 presentation
on direction and design for executors
and voting that we like the design as far as we understand it
and everyone in LEWG will do the homework
to be ready to evaluate whether we can send this on to library
in the next
month.
Obviously that's non-binding,
but it is like we're on,
I don't know,
R11 or R12 of the executors paper.
So having actual forward progress be made in the form of votes like that is
really helpful.
Regarding those bug reports,
something that I think we forgot to mention in the last
couple of episodes is I believe that one of the National Body comments addressed the fact
that numeric had been overlooked for constexpr support.
And I think that got through, if I understand correctly.
So things like accumulate will be constexpr in C++20, which
was not true before Belfast.
I believe that that's true. Honestly, we had something like 130 NB comments to
process in my room, so it's kind of... but that does ring a bell. So yeah, I think that that is true.
Yeah, that's, well, exciting to me. It's the kind of thing I pay attention to.
Yeah.
Okay.
Ty, is there anything else you wanted to go over before we let you go then?
Broadly, yeah, what I said earlier holds.
I think the ABI question is the biggest, most interesting, most pressing thing for the committee and the community. And I think it would be irresponsible of us
not to try to hear from people
as to how much do you depend on pre-compiled artifacts, right?
Do you have a .o or a .so or a .dll or .a
floating around from 10 years ago that you can't rebuild?
How bad would it be if an upgrade to
C++23 forced you to rebuild everything?
I guess one more question I have for you before we let you go: in that paper you kind of presented what you see as the three options, one of
which being kind of keep going as is, which is the most unacceptable option. But I think you said you
yourself are not sure which of the other two you
prefer: either agree to break the ABI, or continue not breaking the ABI and admit that
performance is not the most important thing. Do you have a preference on one of those two options
yet, or are you still debating it yourself?
So I think emotionally I want to break ABI, and I think that that's selfish, maybe.
I don't think I can justify that being the smart thing for the language and the community as a whole.
Because I think admitting that the standard library is not the be-all, end-all on performance,
and that it is expected that you're going to go use, like I said,
Boost or Abseil or Folly if you need performance, like, we could live with that, I think, right? But
it's also just, you know, I spent a couple years writing a book about how time
impacts software, and it's sort of morally repugnant to me that we're promising something forever.
Like, if something is painful, like the ABI break from std::string in C++11, you have two choices:
you either never do that again, or you figure out how to make it less painful. And
the problem's not going to go away. We're going to want to change things always.
So I don't know.
I know what I want, but I don't know that I want it well enough to vote.
I think I would probably be neutral if it came down to actually having to raise a hand on it.
Okay.
Well, we're definitely looking forward to seeing how the discussion goes amongst the committee next year when the paper is brought up.
Yeah. And I strongly encourage everyone to reach out on Twitter or email or blog posts, things like that.
Come up, talk to us at conferences, find a friendly standards committee member and reach out because this is a big deal.
Okay.
Well, it's been great having you on the show again, Titus.
Cool.
Thank you for having me.
It's always a pleasure.
Thanks for coming on.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in
or if you have a suggestion for a topic.
We'd love to hear about that too.
You can email all your thoughts to feedback at cppcast.com.
We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter.
You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter.
We'd also like to thank all our patrons who help support the show through Patreon.
If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast.
And of course, you can find all that info
and the show notes on the podcast website
at cppcast.com.
Theme music for this episode
is provided by podcastthemes.com.