CppCast - The C++ ABI
Episode Date: November 21, 2019

Rob and Jason are joined by Titus Winters from Google. They first discuss some news of C++ tools, including Sourcetrail going open source and C++ Build Insights for Visual Studio. Then Titus goes into what the C++ ABI is, what breaking the ABI means, and whether or not we should consider breaking the ABI in future versions of C++. Titus also shares a preview of his upcoming book 'Software Engineering at Google.'

News: Pittsburgh C++ Meetup Group; Sourcetrail is now free and open source; Guide to Performance Analysis and Tuning; Introducing C++ Build Insights

Links: ABI Now or Never; Software Engineering at Google: Lessons Learned from Programming Over Time; CppCon 2019: Titus Winters "Maintainability and Refactoring Impact of Higher-Level Design Features"; CppCon 2019: Chandler Carruth, Titus Winters "What is C++"; Hyrum's Law

Sponsors: Backtrace (Software Crash Management for Embedded Devices and IoT Projects; Announcing Visual Studio Extension - Integrated Crash Reporting in 5 Minutes); JetBrains
Transcript
Episode 224 of CppCast with guest Titus Winters, recorded November 21st, 2019.
This episode of CppCast is sponsored by Backtrace, the only cross-platform crash reporting solution that automates the manual effort out of debugging.
Get the context you need to resolve crashes in one interface for Linux, Windows, mobile, and gaming platforms.
Check out their new Visual Studio extension for C++ and claim your free trial at backtrace.io/cppcast.
And by JetBrains, makers of smart IDEs to simplify your challenging tasks and automate the routine ones.
Exclusively for CppCast, JetBrains is offering a 25% discount for a yearly individual license, new or update, on the C++ tool of your choice, CLion, ReSharper C++, or AppCode.
Use the coupon code JETBRAINSFORCPPCAST
during checkout at www.jetbrains.com. In this episode, we discuss some news on C++ tools.
Then we talk to Titus Winters from Google.
Titus talks to us about the C++ ABI
and his new upcoming book on software engineering at Google.
Welcome to episode 224 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how's it going today?
I'm doing all right, Rob. How are you doing?
Doing good. How is CodeDive?
CodeDive is great.
Well, it's a good opportunity to hang out with, well, some of these people,
people that I've seen at five different conferences this year so far.
Right.
So, yeah, it's been, it's pretty crazy.
Yeah.
And is the conference over now?
Yes.
So today was the last day.
It's just a two-day conference. They said, I think I heard that there were 1,500 attendees for a two-day four-track conference,
which seems kind of crazy, but it was really packed.
A movie theater.
We basically took over the entire movie theater.
Oh, okay.
And is CodeDive, it's not a strictly C++ conference, right?
It is not strictly a C++ conference. There's also some talks
at least Python and
JavaScript related.
Although some of them still kind of came
back to C++, the ones that I saw that were
like about
asm.js and why you need to
compile your C++ to it
or whatever.
Cool.
Okay, well, on to our first piece of feedback. Last week, we got an email asking us to kind of give a plug for a new C++ user group that was starting up. That one was the Maryland C++ user group, and we got another similar request. This is from Rob Keelan, writing, "Hey Rob and Jason, I just wanted to announce that I just started a C++ meetup group in Pittsburgh. We had our first meeting in November. Our next meetup will be on December 5th, where Robert Seacord will be discussing security concerns of integer operations." So yeah, if you're in the greater Pittsburgh area, you should check out the C++ user group. It's called CPP Pitt, and the next meeting will be December 5th.
Wow, that's pretty cool.
I'm immediately
curious, what is the security
concerns of integer operations?
Yeah, I don't know
exactly what that talk is going to be
on. Sounds like an interesting topic, though.
Yeah, definitely. Let's see.
December 5th. All right, cool.
Well, we'd love to hear your thoughts about the show as well. You can always reach
out to us on Facebook, Twitter, or email us at feedback@cppcast.com. And don't forget to
leave us a review on iTunes or subscribe on YouTube. Before I introduce today's guest,
I just want to apologize for the audio of this episode. Due to some technical difficulties,
Titus couldn't record in his home or office and was in a common space where his microphone did
pick up the sounds of some other people nearby, including a baby crying. I did my best to clean up the audio but you may
still hear some background noise when Titus is talking. Joining us today is Titus Winters. Titus
is a senior staff software engineer at Google where he has worked since 2010. Today he is the
chair of the subcommittee for the design of the C++ Standard Library. At Google, he is the library lead for Google's C++ code base,
250 million lines of code that will be edited by 12,000 distinct engineers in a month.
For the last nine years, Titus and his teams have been organizing,
maintaining, and evolving the foundational components of Google's C++ code base
using modern automation and tooling.
Along the way, he has started several Google projects
that are believed to be in the top 10 largest refactorings in human history. As a direct result of helping to build out refactoring
tools and automation, Titus has encountered firsthand a unique swath of the shortcuts that
engineers and programmers may take to just get something working. That unique scale and perspective
has informed all of his thinking on the care and feeding of software systems. His most recent
project is the book Software Engineering at Google
to be published by O'Reilly in late 2019, early 2020.
Titus, welcome back to the show.
Thank you for having me.
Thank you for having me.
I really should maybe trim that bio down.
You know, I was just sitting here thinking that
do you really need to work on adding some more things to your bio?
It seems a little slim at the moment.
Yeah, sorry about that.
I'll come back in.
I'll come back in a year and we'll see what we can do about that.
It's really difficult to decide what exactly should be in your bio.
It's like when I'm making a conference submission,
it is the hardest part of the conference submission, right?
It's not the talk.
It's what should the bio be at the moment.
Right. Yes.
And there's not a universal accepted word count for those things.
So anyway, yes.
1,000 words or less.
Yes. Thank you for having me.
All of these things we'll be talking about, I'm sure.
Yeah, we'll definitely be talking about more of some of this soon.
But first, we've got just a couple of news articles to discuss.
And I think all of our news articles today are kind of tooling related,
so it's kind of appropriate that we have you on, Titus.
Cool.
So feel free to comment on any of these.
The first one is SourceTrail, a, I guess, software discovery and exploration tool that we've talked about before on the show, right, Jason? And it is going open source.
Yeah, I had some private discussion with Eberhard about this, and it's pretty exciting news. Yes, okay, so SourceTrail is going open source, but the crux of it is that they're hoping that Patreon will help support the development moving forward. And they're being totally open with it, like how much support they currently have today. Let's see what the current number is... 99. Yeah. So the Patreon's only been open, I think, for like three days. So it's definitely early, but if you're listening and you haven't heard of SourceTrail yet, you should absolutely check it out.
You know, we did do a podcast episode with them, maybe like two years ago, Jason. At least two years ago.
Yeah.
It's definitely a great tool. I've seen lots of people on Twitter who are, I guess, checking it out for the first time, who seem to be enjoying it. So if you're, you know, getting good use out of this tool, you should really consider supporting on Patreon. I mention it to every class of students I have, that, you know, it's something they should at least look into if they need a better way to visualize their code. And I think there's no reason why they shouldn't be able to hit their $1,500-a-month minimum goal, to have people who can at least be paid to maintain the open source project.
Yeah, yeah, absolutely. Like, I think it's so critical.
Like, my SRE colleagues will say you can't run production systems without monitoring.
And as a primarily focused on the code base sort of person,
systems like that that give you analysis and tracking and monitoring,
understanding of the code base, it's a direct comparison.
You can't function in a big project without stuff like that.
Now, they do say in the article that, you know,
if you get like 10 million lines of code or something,
that it has a hard time keeping up.
So it probably doesn't work at Google scale.
Yeah, but most things don't.
Right.
Okay.
Next thing we have is an article on Bartek's coding blog.
And this one is a guest post from Dennis Bakvalov. And it's all
about performance analysis and getting started if you are interested in starting to profile your
code. It doesn't go over any specific tools, but kind of goes in some general terms about how
profiling works, what you need to do. And if you're interested in what this post is all about,
Dennis has a his own blog where he talks about some of the stuff in a lot more detail often.
So this kind of brings up a question for me, because at the conference this week,
there's been lots of discussion about how much cache can affect performance
and how much micro benchmarks
are actually meaningful in your project.
And this kind of makes me think,
since we have Titus here
and you're talking like, you know,
billions of lines of C++ code or whatever,
at what point, like,
can you perform benchmarks
on Google's scale code base?
And is it meaningful?
And can you deal with this?
Yes.
There is as much art as science in all of that.
Okay.
And yeah, like, cache issues absolutely dominate.
There's a lot that you have to learn
on the difference in like relevance
between micro benchmarks and,
and sort of more macro benchmarking systems.
Like, we absolutely suggest that you start with micro benchmarks. But at the same time, it's probably once a month that I get, you know, email questions internally on things like,
hey, my benchmark doesn't seem to make any sense.
And it's just purely because they didn't use the result of a function call.
And the optimizer was like, oh, disappear in a puff of optimization.
Oh, that's awesome.
So there's stuff starting with that and then cache issues.
So you have to come up with clever ways to design it to be a cold cache benchmark or a hot cache benchmark.
But then micro benchmarks themselves don't really tell you the aggregate information that you actually care about.
For something like Google, it's usually going to be like, how many queries per second can we actually handle?
So it's a signal,
like a good micro benchmark is a signal as far as performance goes.
But in the end, it's going to be
much like the distinction between unit tests
and integration tests and canary deployment,
micro benchmarks, macro benchmarks,
and then production monitoring.
You need all of the above.
So one of the talks I was in today was from Björn Fahller, and it was about the impact of cache on your C++ code. And one of the comments that he made was, you know, he agreed that micro benchmarks are not a great indicator of the overall system impact. But if you can show fewer cache misses in your micro benchmark, that's going to help the overall system. And I was just curious if you have any comment on that.
Oh, absolutely. I mean, I think that's effectively the major theme behind the Abseil hash tables designs.
It tries to minimize branches, sure, but primarily it tries to minimize cache misses.
And that's responsible for our current estimate of: we're 30 or 40 times better than std::unordered_map.
That's ridiculous.
Okay. But yeah, cache is everything. And this is also why, like, I have to teach new programmers constantly: if you're using std::map or std::set, that is O(log n) cache misses. That's a lot. Cache misses comparable to just locking and unlocking a mutex. Please, please don't do that.
Alright.
And then this last article
we have is from the Visual C++
blog, and it's about
a new tool that they're coming out with.
It'll be available in
version 16.4. You can get it in the
preview now if you're using preview builds.
And this one they're calling C++ Build Insights.
And, you know, we always complain about build times in the C++ world.
So they're putting out this tool where you can just run this little vcperf command before and after your build and it will collect some information and then be able
to visualize what parts of your build are taking the most time and hopefully help you then figure
out how you can reduce that time and improve your build speeds so it looks like a pretty
powerful tool i'm definitely going to check it out when 16.4 comes out. Now, Rob, could you tell in this article, it was not obvious
to me, if
I'm using 16.4, but
for business reasons I need to use the
Visual Studio 14 compiler,
can it still give me this output?
I am not
sure if they mentioned that.
Hmm.
Yeah, I don't know. I mean, it says you need to use
the VS 2019 command prompt to start and stop the performance analysis session. So I'm hoping that means you just need to have that command prompt and you don't need to be using a specific version of the compiler, but I'm not sure.
Right. Because they've been very good recently about allowing you to use older compilers with the newer Visual Studio, right?
Right, right.
That's a good question.
We might need to maybe ping someone on the Visual C++ team, see if they can answer that.
Sy Brand, I think, would be a good bet.
Yeah.
Okay.
And I'm assuming that Google doesn't use Visual Studio.
No, we don't.
It's not our primary. Well, at least it is not our primary, no.
Right.
Okay. Well, Titus, I think the main thing we want to have you talk about today is the ABI, and I was wondering if we could start off by having you define what exactly the ABI is for listeners who maybe aren't too familiar with it?
Yeah, sure. So I think in past talks, I think I discussed this in a talk I gave for Pacific++ in 2018, I described ABI as sort of like a network protocol. It's the way that we're going to sort of define the binary structure of how we communicate
between, in this case, not a client and a server, but the caller of a function and the callee.
And so this is everything from how do you actually put in memory and registers the parameters that are going into the function,
but also even just how you interpret the memory that those parameters are comprised of.
So, for example, we can look at std::string: in most implementations it's going to be something like three words. And it's probably, say, a pointer to the allocation, and then two words for how much of that allocation is live, the size of the string, and the reserved size of the allocation, right? So capacity. But there's nothing in the standard that says what order those fields are in. And there's nothing that even requires that that's a pointer and two integers. It could be two pointers and an integer or any combination of those things. And so this comes up primarily when we're
talking about building things at separate points in time.
So shared libraries and DLLs, you encounter this a lot.
And if the layout of your objects changes, or even just the interpretation of the bits that make up your object changes,
then when you build that in one translation unit and one in your main,
and then you try to read those bits on the other side of the gap in your SO or your DLL,
then obviously that's not going to work in exactly the same way that it wouldn't work
if you were transmitting a network protocol that was talking in binary,
and you changed the layout or changed
the meaning of things. Or if you changed the format that you were using for zip compression
or JPEG, the sender and the receiver have to agree on both the number of bits and the meaning
of all of those bits. And when they don't, then we have a problem.
The reason that this is all tricky is because we've, as a community,
sort of been moving towards greater and greater binary compatibility.
So old SOs, old DLLs, old builds of things are compatible uh over time and that means that whatever the standard
library implementers back you know 10 years ago were doing or maybe five years ago depending on
your platform whatever they chose to do back then is now baked into the system forever and
it's kind of holding us back. And that's a hard question.
Right. So since we're talking about, you know, history a little bit right now, my understanding is the ABI did break when C++11 came out. Is that right?
So this is sort of platform-dependent. Okay, there isn't one defined ABI; it is really per-platform. And the standard doesn't actually
define ABI itself. It is a thing
that is entirely up to the implementation.
Does the standard
even mention the ABI?
It doesn't, which makes all of this
extra complicated.
In 2011,
it was well known to the committee
that the GCC
implementation of std string was copy on write.
And for various reasons, the committee decided that it was important to actually forbid the copy on write implementation.
And that was understood that that was going to be an ABI break for GCC and for libstdc++ especially,
because anything that was compiled with that copy-on-write string
wouldn't work with a small string optimization string implementation instead.
Even if it's the same number of bits, it's a different meaning of those bits.
And so starting in, you know, 2011, 2012, when they introduced the option to change to a standards-compatible std::string, there's a, you know, silent, difficult fork in the community between which binary representation you are using for that. And it's a thing that the maintainers for libstdc++ complain about to this day. It is very costly because people rightly don't entirely understand the places that they might be depending on a binary representation of something as common as std::string.
So I think something I've kind of seen recently, recently enough to surprise me.
When you go and build your C++ project
on some version of Linux with GCC,
you might see an error that says like,
oh, GLIBCXX 3-point-something, whatever, not found.
And it's trying to link to the standard library at some specific version, because some library that you just included was expecting a different version of the C++ ABI. Does that sound about right?
That does not surprise me. Yeah. Okay. And so part of the reason that this is difficult is that ABI
is very natural for a language like C, right? Like we don't care about the representation or meaning of an int
or a struct of two ints or any of the types that are available in C.
And so we have the same tooling ecosystem.
We distribute the same SOs.
We use the same linkers as the C ecosystem does.
And yet, although everything in the ABI for C is basically fine and
pretty reasonably assumed to be stable forever, in C++ it's weird to even think that it might be.
And we don't have any way to broadcast that to people. And so everyone winds up like sort of silently depending on it.
And it's really, really holding us back right now.
Right.
So that kind of brings us to your paper that you wrote before the Belfast C++ meeting, which is ABI Now or Never.
We did discuss your paper in the news a couple episodes ago when we had JeanHeyd Meneide, but could you maybe tell us a little bit about
what brought you to write that paper and what the response was like in Belfast?
Yeah, so what brought me to write the paper, I've been
squealing to the committee for a little bit.
Sorry, squealing. Yeah, I mean, really, that's
kind of it.
But specifically on the topic of, are we going to break ABI in 23?
Because if we are, we should figure out as much value as we can to pile it all in,
because I don't know whether we'll be willing to do it again anytime soon.
And if we aren't, we should also maybe just admit that we're probably not ever going to do it. Because at least
under Linux platforms
with C++ and
Libstd C++, we've effectively
been ABI compatible since
2011. And
if we're not doing that in 23, then
the next chance is 26.
That'll be 15 years of
people having the ability to silently depend on this.
Like Hyrum's Law dominates, right?
Like everything actually has hidden assumptions on the fact that ABI is stable.
And if we actually try to change it at this point, like it might break the ecosystem completely but if we don't
it's also really hard to say that the standard library cares about performance, because it doesn't. I mentioned earlier, like, near as I can tell, the API-compatible, source-compatible node_hash_map in Abseil is, in many microbenchmarks, 30 or 40 times more efficient than an unordered_map.
Wow.
And a reasonable regular expression implementation is probably hundreds of times more efficient than std::regex.
I like your characterization of a reasonable implementation.
Yeah. And clearly, let's not get started on, like, Hana's stuff. That doesn't even compare.
Yeah, right.
You know, and so, like, we can't claim to be the systems language that puts performance above everything else if we're so concerned with backwards compatibility that we're willing to leave orders of magnitude of performance in kind of important data types on the floor. And that is the fundamental, deep question for the community and the committee right now.
So I would like to say something I meant to say a minute ago,
but missed the opportunity,
is that I really like your explanation
that the ABI is like a network protocol.
I've never heard that explanation before,
and I think it's very, very fitting.
But also, I'm sorry, go ahead.
It's also noteworthy that if you look at any grown-up network protocols, they have a version number in them.
Right.
And the ABI doesn't, which makes this extra hard.
Well, if it's not in the standard.
Right, but none of the major platform ABIs have a version number in them. And as we all know, like if you want to upgrade from a thing that has no version number to a
version number,
even that is a break.
So fun times.
So for my personal experience,
I read Herb and Andrei's C++ Coding Standards book, where they effectively
say,
don't put C++ objects across ABI boundaries.
To which I said, that's just silly, I'll just recompile all the things. And then I've also, like, early in my career, would download various versions of Boost. And downloading Boost on Visual Studio specifically is, like, hyper-specific to which release of Visual Studio that you're using if you want the binary.
So for most of my career, I basically just assumed I can't trust a C++ binary. I'll just build the things that I need for my particular target. So what I would like to ask you is,
what kinds of projects does this really affect?
Who can't recompile their projects?
People that have plug-in interfaces,
people that are relying on pre-compiled,
vendored code, things like that.
In our experience,
Google famously tries to build everything from source,
and we fight real hard, even with external clients and distributors and things to get source for everything.
And even with that, when we recently had to make ABI changes in the standard library,
and even for that, it probably cost us at least five engineer years.
And we're usually pretty good at making changes and stuff.
So where did that time actually go?
That went to tracking down all the places that those dependencies had crept in
and trying to push back on those teams and mitigate things.
It went to coming up with mitigation strategies
where we, in some cases,
may still have the old symbol available in a binary
so that things continue to run.
It isn't a solution that will last forever,
but if you change the mangled name,
which is the name that your linker is relying on,
if you change the mangled name of a symbol, then you can have one type,
and you get the new version whenever you're compiling code,
but the old binary representation may still coexist in your DLL or your SO.
Sounds tricky.
Yeah.
We had to do a lot of tricky stuff.
And we have almost no dependence on ABI.
So I pretty much assume that Linux distributions and Adobe plugins and everything.
A surprising number of things are actually
shipping binaries or unable to recompile things.
Because it's been 10 years
since anything changed on those platforms. It's not as bad
under Windows,
but in response to a lot of users complaining that they had to rebuild for every new Visual Studio release,
Visual Studio has been better about not breaking ABI
every release like they used to.
And now they're sort of starting to see that they can't break it as easily as they used to, either.
Yeah, it's almost like an irony that they've been better at doing something that perhaps if they had been worse at it, it would be better.
Yes, because it is very easy for a user to want to not have to recompile code. Because it is sort of a tragedy of the commons, right? Like, if we had the ability to recompile things and, like, change the layout for unordered_map, change std::hash... if we had that, right, we could also give everybody, you know, some significant performance boost for those use cases, right? But that's sort of diffuse and in the future,
whereas it's very easy to complain about,
ah, I want this to be compatible.
Why doesn't this work right now?
I want to interrupt the discussion for just a moment
to bring you a word from our sponsors.
Backtrace is the only cross-platform crash and exception reporting solution
that automates all the manual work needed to capture,
symbolicate, dedupe, classify, prioritize,
and investigate crashes in one interface. Backtrace customers reduced engineering team time spent on figuring out what
crashed, why, and whether it even matters by half or more. At the time of error, Backtrace jumps
into action, capturing detailed dumps of app environmental state. It then analyzes process
memory and executable code to classify errors and highlight important signals such as heap corruption, malware, and much more. Whether you work on Linux, Windows, mobile, or gaming
platforms, Backtrace can take pain out of crash handling. Check out their new Visual Studio
extension for C++ developers. Companies like Fastly, Amazon, and Comcast use Backtrace to
improve software stability. It's free to try, minutes to set up, with no commitment necessary.
Check them out at backtrace.io/cppcast.
So, going back to your paper, what was the response from the rest of the committee when you brought it to Belfast?
So, full disclosure, I sent the paper to the reflectors and had quite a bit of discussion. Like, people are interested and engaged. I know the discussion is coming, but Herb and I agreed that the primary goal for the Belfast meeting was resolving the NB comments on the C++20 draft. The ABI is a '23 problem.
That makes sense.
So we delayed it. We're hoping that it will happen in Prague,
but I'm also hearing that there's potentially questions about what rooms are available and things like that.
So it will come up sometime in 2020, hopefully in Prague, but maybe not until the summer.
And Prague is February?
Yes, Prague is February.
And the next meeting after that is Varna.
Varna, Bulgaria.
Yes.
Okay.
Okay, but there was a new study group formed at Belfast, right?
There's an ABI study group now?
So it's not so much an ABI study group as, I think they're calling it, a resource group.
Okay.
Which is because it is very common for people to not understand the subtleties and nuance of, like, is a given proposal going to break ABI? The ARG, the ABI Resource Group, is, I guess, present primarily to answer questions of: if we made this change,
is this an ABI break on any major platform? Okay, because right now, you'll only know that if,
you know, a compiler implementer is sitting in the room and notices it.
Yeah, sort of. Like, some of us, like, I've gotten better at spotting the things that I think
are likely to be an ABI break, but there are still some times that I cannot do that work in my, I can't do that math in my head.
So having some people that are, like...
I mean, what is the probability that something gets approved for the standard, and it takes six months or a year or more for someone to realize this is an ABI break, and now we have to reconsider this thing that was already approved?
So far in my experience, I don't think that's happened. We've had a couple cases where things made it maybe all the way to plenary,
and then someone pointed out, oh, on my platform that would be an ABI break.
One that I'm reminded of.
We added a variadic lock guard in C++ 17.
Yes, right.
And that couldn't be done with the
I forget the name of which one's which.
It couldn't be done by making
the existing single template
argument lock.
It couldn't be done by changing that
to be variadic, because on
some platforms, the mangled
name of a class template of a single argument
is different than the mangled name of a class template
of variadic arguments where there was only one provided,
which was subtle.
There was a whole argument over, like, does this even matter?
Is this a class that could possibly be passed through ABI boundaries?
Like, is anyone ever going to notice this? But the implementers sort of rallied, and like, no, no, even if it's only an ABI break in theory, we're not going down this path.
Yeah. So for the sake of our listeners, it was, I believe, lock_guard, if I'm looking at this correctly, that was added in C++17, that is the variadic version.
And before that, we had unique lock, scoped lock, and the other ones.
Yes, it was unique lock.
Yeah, and actually that example sort of illustrates some of the committee issue of it all,
of the term ABI does not appear anywhere and the standard itself does
not govern the ABI of anything. But we've been operating under a pretty solid regime of:
if it's an ABI break, implementers basically have a complete veto over going down that path.
Wow. Um, and I mean, it's certainly within their rights to decide how their product is going to work.
But it does put us in this awkward position of we claim to be a language of performance above all else.
And standard library is maybe not living up to that right now.
Which could be fine.
Like I describe in the paper, I don't think it's
immediately the end of the world for C++
to admit that for the standard
library, stability is more important.
But it does mean pretty explicitly
that if that's the case,
you shouldn't be using the standard library if performance
matters. You shouldn't be using
standard library simple types
like vector and type
traits and things like that.
And then if you actually need performance, then you go to Boost or Folly or Abseil, depending on what level of source compatibility you actually need.
So the one aspect of this that I totally agreed with on your paper, or at least maybe if I was just reading between the lines, is that the committee has to decide that ABI is something that they are dealing with,
one way or the other, right?
So, you said you've only gotten comments on the reflector so far,
but do you think that the committee will actually,
like, at some point, the standard will actually mention that the ABI is a thing?
I don't think so, and I don't think that's what's really on the table.
I think it is purely a question of deciding whether or not to honor that implementer veto.
Okay.
Because if the committee votes overwhelmingly that we're going to change these things that break ABI,
it only really is going to work if all of the implementers are on board.
Right.
And if we decide that we're just like,
we're going to say that we break ABI and then no one ships it,
that doesn't help anybody.
And claiming that we're about performance and then actually just honoring the implementer veto on ABI issues
is a little false to the community, right? And so regardless of whether we put the word ABI
in the standard, we really do have to have the discussion and come to consensus, both amongst,
you know, interested committee members, but also amongst implementers
of what are we doing?
Because neither option is any good.
It is a
significant shift in
understanding of
what the goals of the standard library are
to admit that performance
is not actually our
priority. But it would also
be the biggest disruption to the C++ ecosystem ever,
if we decide to do this properly.
So that's fun.
I think, yeah, I mean, I've always personally assumed that the zero-overhead principle applied to built-in language features, not necessarily to standard library features.
But I personally have no problem with the ABI breaking with every release, honestly.
Right.
Because the projects I've worked on, that's totally a viable option: to be like, well, you have to recompile.
Right. And I think, like I say in a lot of
talks, and it's a major theme of the book, that time is a very hard thing for us to deal with in
software, and ABI is effectively us ignoring that time is going to really hurt us as far as performance goes.
We can't do this forever.
We can do this for a while.
I don't know how long.
I'm honestly a little surprised that we haven't encountered a Spectre-style
speculative execution bug that required some change to
how we actually invoke a function safely.
And if you have to do that, that's a huge ABI break.
So all the current Spectre mitigations apply at the call site, huh? Am I... is that wrong?
No, I'm hesitating because I don't know enough to answer that question.
Oh, okay. That's fine. Yeah.
Since you just mentioned the book,
do you want to tell us a little bit more about that?
Yeah. So it's coming out through O'Reilly sometime soon-ish.
I don't have an exact date of when it will hit the shelves.
But yeah, for the last year and a half, a little bit more,
I've been the lead for an effort to put together a book on software engineering as Google understands it.
And this is very much in line with the previous big book from Google,
the Site Reliability Engineering book,
where I tried to explain, like,
okay, how do you have reliable production systems?
And how do you, like, keep
things running smoothly and scale up to, you know, managing data centers and all this? And my book
is mostly about, like, how do you manage a code base of, you know, millions, hundreds of millions
of lines of code and, like, keep that working for tens of thousands of engineers smoothly for decades.
And obviously that is not going to be the problem that most people have.
Right.
But I keep describing it as this is sort of a scouting report, right,
of these are the things that were necessary in order to get to this scale.
And a lot of them, I think, are perfectly fine,
like reasonable ways to run a smaller organization.
So if you want to, you know, have a smooth path that will scale up,
maybe try this.
And then a lot of it is just like insights into like
how to think about a particular problem.
So questions of version control policy
and dependency management
and how to manage a build system
and also cultural things
like how to be a good teammate
and how to do code reviews like a grown-up.
And so it really sort of covers the whole gamut
through the primary lenses of
how is this going to work over time how is this going to work as your organization scales up and
what are you trading off when you make decisions in this space.
So, you said "primary". Are there co-authors?
So the book is something like 25 chapters.
There are three major leads on it.
Myself, my tech writer, Tom Manshreck, and Hyrum Wright, who I think you've seen at conferences and the like.
Yes. Sounds familiar.
Yeah. He talks a lot about refactoring things. Tom and Hyrum and I started this, but because there's 25 or 30 chapters going,
and we're not the experts on all of these things,
we recruited experts from teams all across Google.
I think there's something like 15 or 20 primary authors.
We've been deeply involved in all of the chapters
to make sure that it
ties in with the major themes
and that the chapters
reference one another and talk about
the emerging themes
across the piece as a whole
but yeah
it was
the number of pages that have my name on it
is perhaps relatively small
and the number of pages that have my fingerprints on them is most of the pages.
So is this a topic that O'Reilly approached you about, or did you go to O'Reilly?
We went to O'Reilly, sort of following with the contacts that we had made on the SRE book.
But that was no effort.
Right. I'm sure it wasn't.
Yes, that sounds very good.
So, yeah, it has been a fairly nice collaboration so far.
And it's currently available for pre-order, right?
Yes, it is currently available for pre-order.
They have, I don't know, something like half of the chapters are through most of their production process.
We're still doing last copy edits on a couple things.
We are hours away from Google being completely done.
And then I don't know what all they wind up doing.
And then at some point they have to find time on, as I understand it, some very, very big printers.
And then there will be a book.
Do you have any advice for anyone considering working on a book?
Don't.
So, especially in tech, the thing that I should have understood but didn't, that seems so obvious in retrospect, is that in tech it's a lot more nebulous, right? Like,
you can know what you want to say, but the questions of did I write enough, did I write too much, is it
clear enough, is it punchy enough, does the message get through? There's no unit test for that, and I had a lot of anxiety over that.
Yeah. Wow.
Because it is a lot harder to figure out when you are done, and figure out, like, okay,
I feel good about this, but am I totally delusional? So it's only been really in the last handful of
months, as more people have gotten their hands on drafts of things, that I've been starting to breathe a little easier.
But yeah, it is a lot of work.
Wow.
Okay.
We talked last week all about other Belfast news.
Is there anything else you wanted to share?
Any highlights out of the most recent committee meeting?
Not a lot, I think.
And it was mostly addressing the quote-unquote bug reports?
Yeah, exactly. It was a week-long process
of bug report and code review, effectively.
Not a whole lot of new features to talk about. Not a whole lot of interesting
development. At least not for rooms like
my room. There was certainly some
interesting stuff, as
I understand it, in SG1
and in EWG.
We did make some small
progress on executors
in Library Evolution,
mostly in terms
of listening to
another SG1 presentation
on direction and design for executors
and voting that we like the design as far as we understand it
and everyone in LEWG will do the homework
to be ready to evaluate whether we can send this on to library
in the next
month.
Obviously that's non-binding,
but it is like we're on,
I don't know,
R11 or R12 of the executors paper.
So having actual forward progress be made in the form of votes like that is
really helpful.
Regarding those bug reports,
something that I think we forgot to mention in the last
couple of episodes is I believe that one of the National Body comments addressed the fact
that numeric had been overlooked for constexpr support.
And I think that got through, if I understand correctly.
So things like accumulate will be constexpr in C++20, which
was not true before Belfast.
I believe that that's true. Honestly, we had something like 130 NB comments to
process in my room, so it's kind of... but that does ring a bell. So yeah, I think that that is true.
Yeah, that's, well, exciting to me. It's the kind of thing I pay attention to.
Yeah.
Okay.
Ty, is there anything else you wanted to go over before we let you go then?
Broadly, yeah, what I said earlier holds.
I think the ABI question is the biggest, most interesting, most pressing thing for the committee and the community. And I think it would be irresponsible of us
not to try to hear from people
as to how much do you depend on pre-compiled artifacts, right?
Do you have a .o or a .so or a .dll or .a
floating around from 10 years ago that you can't rebuild?
How bad would it be if an upgrade to
C++23 forced you to rebuild everything?
I guess one more question I have for you before we let you go: in that paper you kind of presented what you see as the three options, one of
which being kind of keep going as is, which is the most unacceptable option. But I think you said you
yourself are not sure which of the other two you
prefer: either agree to break the ABI, or continue not breaking the ABI and admit that
performance is not the most important thing. Do you have a preference on one of those two options
yet, or are you still debating it yourself?
So I think emotionally I want to break ABI, and I think that that's selfish, maybe.
I don't think I can justify that being the smart thing for the language and the community as a whole.
Because I think admitting that the standard library is not the be-all, end-all on performance,
and that it is expected that you're going to go use, like I said,
Boost or Abseil or Folly if you need performance, like, we could live with that, I think, right? But
it's also just, you know, I spent a couple years writing a book about how time
impacts software, and it's sort of morally repugnant to me that we're promising something forever.
Like, if something is painful, like the ABI break from std::string in C++11, you have two choices:
you either never do that again, or you figure out how to make it less painful. And
the problem's not going to go away. We're going to want to change things always.
So I don't know.
I know what I want, but I don't know that I want it well enough to vote.
I think I would probably be neutral if it came down to actually having to raise a hand on it.
Okay.
Well, we're definitely looking forward to seeing how the discussion goes amongst the committee next year when the paper is brought up.
Yeah. And I strongly encourage everyone to reach out on Twitter or email or blog posts, things like that.
Come up, talk to us at conferences, find a friendly standards committee member and reach out because this is a big deal.
Okay.
Well, it's been great having you on the show again, Titus.
Cool.
Thank you for having me.
It's always a pleasure.
Thanks for coming on.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in
or if you have a suggestion for a topic.
We'd love to hear about that too.
You can email all your thoughts to feedback at cppcast.com.
We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter.
You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter.
We'd also like to thank all our patrons who help support the show through Patreon.
If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast.
And of course, you can find all that info
and the show notes on the podcast website
at cppcast.com.
Theme music for this episode
is provided by podcastthemes.com.