CppCast - Distributing C++ Modules
Episode Date: December 16, 2021

Rob and Jason are joined by Daniel Ruoso and Bret Brown from Bloomberg. They first talk about Jason's new Object Lifetime Puzzlers book and a blog post from Kevlin Henney on Agile processes. Then they talk to Daniel and Bret about their research into using Modules at Bloomberg, and some of the changes still needed from compilers and build systems to use Modules in large-scale software development.

News
- Boost v1.78
- Object Lifetime Puzzlers Book 1
- Agility ≠ Speed

Links
- P2409: Requirements for Usage of C++ Modules at Bloomberg
- P2473: Distributing C++ Module Libraries
- What Agile is Actually About?
- CppCon - Lessons Learned from Packaging 10,000+ C++ Projects - Bret Brown & Daniel Ruoso - CppCon 2021
- CppCon - Modern CMake Modules - Bret Brown - CppCon 2021

Sponsors
- Use code JetBrainsForCppCast during checkout at JetBrains.com for a 25% discount
Transcript
Exclusively for CppCast, JetBrains is offering a 25% discount for purchasing or renewing a yearly
individual license on the C++ tool of your choice: CLion, ReSharper C++, or AppCode. Use the coupon code JetBrainsForCppCast during checkout. In this episode, we talk about object lifetimes and agile development.
Then we talk to Daniel and Brett from Bloomberg.
Daniel and Brett talk to us about modules and what changes need to be
made to use them. Welcome to episode 329 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm all right, Rob. How are you doing?
Doing okay. Getting ready for the holiday
season. Yeah. If you're ordering Christmas presents, you need to order them today, I think.
I think I'm all done with my Christmas shopping. Are you done or you still got more to do?
I think I'm done. I'm pretty sure I'm done. That's good. I actually started a couple months ago. I'm
like, ooh, I'm gonna order that before I forget about it. Okay.
Well, at the top of every episode, I always read a piece of feedback.
We got a lot of great feedback from last week's episode.
It was a lot of fun talking to Kate and Guy.
This tweet's from Rob Bernstein saying this episode was fantastic.
I didn't want to turn it off.
I love the interaction between the four of you and all the journeys into different topics.
I've pre-ordered the book.
Can't wait to dive in.
Yeah.
We got a lot of feedback on Twitter for that one.
We got a lot of good feedback.
The one thing we did get also was the most Reddit comments on any post ever, I think.
But it was not about the show at all.
I saw a comment.
Yeah.
200 posts of just someone ranting about Rust versus C++.
That had nothing to do with the episode.
That sounds great.
Yeah.
So I was a little sad to see that there were that many comments
that had nothing to do with the show.
So welcome.
We've got lots of feedback, though.
Yeah.
Well, we'd love to hear your thoughts about the show.
You can always reach out to us on Facebook, Twitter, or email us at feedback at cppcast.com.
And don't forget to leave us a review on iTunes or subscribe on YouTube.
Joining us today is Daniel Ruoso.
Daniel is the manager for code governance at Bloomberg with a focus on driving large-scale static analysis and automated refactoring. Daniel has been working the past 20-plus years
with a persistent lens on how to help engineers
be more effective with build, deployment, and analysis tooling
on various different environments and languages
with a more recent focus on bringing C++ modules
to a state where they can be used by more people.
Daniel is a Brazilian music nerd
and will talk endlessly about that if you let him.
He also plays classical guitar.
Daniel, welcome to the show.
Thank you for having me.
Yeah, welcome.
How large is the Code Governance team, if I might ask?
So we have around 25 people between full-time employees and contractors.
We're driving, we have a static analysis pipeline that runs around 20 projects per second,
running various tools across the entire code base
that we can see.
Wow.
So it's been a journey.
We have a lot of tools that we write in-house
for specific refactorings,
for getting rid of old dead code,
for feature switches that are considered fully deployed,
and for removing blockers for migrating to newer
platforms and those kinds of things. So it's been a wild journey. Wow, sounds like that's pretty
crazy. Yeah. Also joining us today is Bret Brown. Bret is the team lead for the Bloomberg Build
Tools team, focusing on compilation tool chains, build systems, and large-scale code migrations.
Bret likes improving C++ software development by treating projects more like cattle and less like pets.
In October at CppCon, Bret presented a talk on CMake modules
and co-presented a talk on packaging C++ with Daniel.
Both Bret and Daniel are active participants in the ISO Tooling Study Group.
Bret is a lead and founding member of the C++ Guild at Bloomberg.
He contributes to Bloomberg Working Groups on C++ tooling, testing, conferences, deprecations, and ISO engagement.
Bret, welcome to the show.
Hey, glad to be on.
Can you give us the elevator pitch on what you mean between more like cattle and less like pets?
Yeah, I mean, it seems like there's a lot of reasons for this, but it seems like a lot of C++ projects people treat as a special thing,
and then they end up adding special build rules, special directory layouts, all sorts of things.
And at the scale we're operating at Bloomberg, like Daniel was talking about, that gets to be really, really expensive. And even at a smaller scale, if you have a side project, you probably
don't want to sink a lot of time into just keeping things running, right? So really what people want
is something a little more regular so that they can use the same CI pipeline on all their projects.
They can use the same analysis techniques. It works with all their IDEs. And a lot of the
bespoke things that people add are actually the things that get in the way of that.
So a lot of what I focus on in my free time, my fun projects are ways to get those kinds of
pain points out of the way to let people have all the features they want without having to bake those features into the physical project they're working on, if that makes sense.
So my talk about CMake modules is, for instance, a technique to get your bespoke CMake out of your project and have that live as something you can reuse across your projects.
Very cool.
All right.
Well, Bret and Daniel, we've got a couple news articles to discuss.
Feel free to comment on any of these,
and then we'll start talking more about the work you're doing at Bloomberg with modules and everything, okay?
All right.
First thing we have is a new version of Boost.
It's always worth mentioning these.
This is version 1.78, and lots of changes to ASIO
file system,
BoostBeast. I don't think there's
any new libraries here, are there?
I didn't see any new ones, no.
No new libraries added.
Anything in particular anyone wanted to
point out in all these
changelogs? I think it's interesting there's a breaking change to how regexes are parsed.
Oh, I missed that.
That's interesting.
Yeah.
Capital B versus lowercase B as per Perl behavior.
That's, um, that's, that's kind of fascinating.
Cause I mean, there's like a lot of boost regex code out there right now.
Right?
Like, one of the projects that I'm working on might still use Boost.Regex.
I think we may have moved it to std::regex.
I'm not sure now.
But if it's still using Boost.Regex and if it had this in here,
it might be a while before we noticed that it broke parsing in some of our code.
I'm not too familiar.
I don't know regex well enough to know what this breaking change means with slash capital B or slash lowercase B.
I don't know what that means.
Slash B is the word boundary.
Okay.
And then capital B would be not a word boundary, right?
Okay.
So just like lowercase S is any white space, capital S is any non-white space.
Okay.
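To make the distinction concrete, here's a minimal sketch using std::regex, purely to illustrate the semantics of the two escapes; the 1.78 change itself is about how Boost.Regex parses them relative to Perl.

```cpp
// Minimal illustration of \b versus \B, shown with std::regex just to
// demonstrate the semantics being discussed.
#include <iostream>
#include <regex>
#include <string>

int main() {
    const std::string text = "cat catalog";

    // \bcat\b: "cat" bounded by word boundaries, i.e. the standalone word.
    const std::regex whole_word(R"(\bcat\b)");

    // cat\B: "cat" followed by a NON-boundary, i.e. the "cat" inside "catalog".
    const std::regex embedded(R"(cat\B)");

    std::cout << std::regex_search(text, whole_word) << '\n'; // prints 1
    std::cout << std::regex_search(text, embedded) << '\n';   // prints 1
    return 0;
}
```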
So they're changing to make it look more like Perl or less like Perl?
More like Perl is what it seems to be.
I've got a confession to make that I'm a long-time Perl guy.
That's,
um,
so you speak like Hubert basically.
I also find it interesting that there's so many changes to Asio,
because,
you know,
we've talked a lot about the possibility of standardizing networking
and with the reluctance to ever break ABI in the standard,
but it does seem like something like ASIO,
it's a moving target.
I mean, they're constantly updating the network libraries
and it just makes me question,
does networking actually have a place in the standard or not?
There was a conversation on C++ now in 2019
where we were talking about this idea
of adding graphics library to the standard.
And at some point, I was making the comment,
do we really want a GCC to depend on Xorg or Wayland?
Right.
And it's kind of like a ridiculous proposition
on the face of it.
But I understand why people want it because the lack of package management in C++ means that the only way for you to reasonably do this in a way that's not full of bespoke solutions is to actually put it in the standard.
I guess since we don't have a package manager, I guess everything has to go in the standard so we can do it in a uniform way.
Yeah, even for less controversial features like threading support, we've actually had problems adopting at Bloomberg using standard stuff, because there's implicit dependencies between
your stable standard library and things that are significantly less stable, like oneTBB, right?
Yeah, libTBB does not have any API promise at this point. So if you're using execution from C++17
and you ship your executable in a non-containerized environment,
it's very likely that you don't know that you risk breaking at runtime.
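For context, this is the kind of innocuous-looking code that picks up the dependency; a minimal sketch, assuming GCC's libstdc++ where the C++17 parallel algorithms are backed by TBB and you typically link with -ltbb.

```cpp
// Sketch of the hidden dependency: the execution policy is standard C++,
// but on GCC's libstdc++ the parallel algorithms are implemented on top of
// TBB, so whatever TBB is installed at runtime becomes part of your
// deployment story.
#include <algorithm>
#include <execution>
#include <vector>

int main() {
    std::vector<int> v{5, 3, 1, 4, 2};
    std::sort(std::execution::par, v.begin(), v.end());
    return v.front(); // 1
}
```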
Right. All right.
Jason, do you want to talk about this next one?
Because this next news item is coming from you, right? So I thought, what the heck, I'm going to go ahead and make it into a full-on puzzle book. So I just released my first puzzle book.
It is called Object Lifetime Puzzlers Book 1, which implies that I plan to release more than one of them.
Did any of you have the chance to look at this at all, by the way?
Yeah, I'm really bad at these things.
Like there's this Twitter post about what does this do?
Oh, no, no, no, no, no, no.
Let's be clear. No, no, no, no.
This is explicitly
not Shafik's polls.
Okay.
These are fun.
Well, it was fun. So I did.
I turned on The Sound of Silence by Simon & Garfunkel
and I started doing the puzzle
and I got
Be Sure to Drink your Ovaltine.
Is that the right answer to the, to the sample you had?
I think that's right.
Yeah, that's right.
That's right.
That's right.
Yeah.
So, uh, if you, if you actually do get the book, like I do recommend people get the print
copy of it because it is a puzzle book and you want to write in it.
Um, it actually covers automatic lifetime, static lifetime,
dynamic, and thread local object lifetime. And it's all just like sorting out when does an object
lifetime end? And when does it begin? When does it begin? When does it end? And then you fill in
the puzzle and it explains it all along the way. So my niece, who's actually visiting at the moment,
happened to be here right when the proof copy came from Amazon. So I just handed it to her. She has absolutely no experience
or interest in programming at all and had to ask two questions from me and then just tore into it
and spent like the next three hours doing the puzzles. She had like a lot of fun with that.
That's really neat. Yeah. I mean, we get this stuff.
These are the kind of edge cases we hit
when we're upgrading a compiler.
And the compiler suddenly changes
when the destructor gets called.
There's always a trigraph involved.
I mean, not a trigraph,
like a ternary operator involved for some reason.
But yeah.
Well, it shouldn't.
Anything that happened,
if you're not talking about a global static,
then your compiler was broken previously because these things are well
defined.
That's the point of C++.
Right.
Yeah.
So I'm not arguing with you.
A lot of times our bugs are like,
this is weird edge case.
And it's not,
it was broken before.
And now it's hopefully less broken, but it's still kind of broken, but it's changing, or it's broken in a different way.
None of these,
none of these straddle the edge of a stud of the line of edge cases.
They all work on all compilers.
It's all verified code.
So that's good though.
You know,
maybe we should pass this around Bloomberg.
One of the migrations we're doing is to get rid of the other kinds of statics you were just talking about in the code base.
The global ones.
Yeah.
Maybe we should start passing this around Bloomberg.
It's like, do these things instead.
Because that's what we need.
We're like, okay, if you want static lifetimes, these are the defined things.
Stop putting them in the top of your header files and stuff like that.
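For anyone following along, a small sketch of the distinction being made here, the "defined things" versus the header-scope statics being discouraged:

```cpp
// Illustration of the two kinds of statics being discussed.
#include <string>

// Discouraged: a namespace-scope static in a header gives every translation
// unit that includes it its own copy, initialized before main() in an
// unspecified order relative to other globals.
static std::string config = "default";

// Preferred: a function-local static has a well-defined lifetime -- it is
// constructed on first call (thread-safely since C++11) and destroyed at
// program exit.
std::string& get_config() {
    static std::string instance = "default";
    return instance;
}
```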
So I did actually, for the record, intentionally meant this to be fun.
So hopefully at least some people have fun with that.
Like I said, I know non-programmers can have fun with it.
My actual concern is that C++ programmers aren't going to have fun because they're going to question everything that's in there.
And trust me, it is all well-defined behavior.
Like, just get that out there.
There's no, like, I even had one of my patrons almost immediately be like, I don't think that's defined behavior.
I'm like, yes, it is.
I've been teaching this for five years.
It is.
But C++ programmers don't trust the compiler.
People who aren't programmers just trust the rules that I put in the book.
All right.
Yeah.
The person probably even started with, um, actually.
No, no, no.
All my patrons are really cool.
Okay. And then the last thing we have here is a blog post from Kevlin Henney, who was definitely a great guest we had on a while ago.
And this is, you know, he's kind of having a take on Agile and how maybe we're not tracking the right things with Agile.
Does anyone have a better description of this one?
So I've been using Agile for like a really long time. Like, I started with Extreme Programming back in like 2001, kind of thing. And this has consistently been the hardest thing
to explain to people that like the point of us measuring those things is not to make it go faster.
The point is to make sure that when I'm talking to the customer that I'm saying,
we think that we're going to deliver this in four weeks from now,
that I'm actually going to deliver that four weeks from now.
And that it's better to say, I'm not going to deliver,
than to say, I'm going to deliver and not do it.
And this is consistently the thing that people get wrong when thinking about Agile is just thinking that the point is to go faster.
But in my experience, the main point is to be predictable.
I think it's also good that Agile helps you think about where am I going? Am I still going the right
way? You have little checkpoints to do that sort of thing.
And the point, as he says in the article,
isn't to be fast and sprint, per se.
It's to have those shorter increments
where you're thinking, okay, what about now?
Am I doing the right thing now?
And not wait a month, a year, or worse to decide,
oh, wait, that whole time I was just wasting my time.
I should have done something else.
So his analogy to vectors and mathematics was really effective.
I thought that was a good way to explain it.
Like you need, yeah, you want a vector that has a long magnitude
in this thought experiment,
but you also want it to have the right direction, right?
That's the point of a vector.
It's not just, so if you're talking about how fast are you going
in a two-dimensional space or something, it doesn't matter where you point it.
It's not just how long the arrow is that you're drawing or something.
And that goes for software projects, too.
Are we going the right way?
Going the right way is way more important than going fast, because you can go fast in the wrong way and end up worse off than you started.
So that was a really good point. And if we're continuing the analogy that you're making there,
we also kind of want to find what the unit vector is
to get that predictability.
That's a good point, I didn't think about that.
I have a standard rant on this conversation.
If you go to my LinkedIn, you'll find this article, I can put a link somewhere
later, I guess, on how most people miss the point of Agile. And I think this article was
really interesting in bringing that point. All right. So Bret and Daniel, I guess what's a good place to start?
Because you have been working on switching to modules at Bloomberg, and it sounds like you've kind of faced some challenges,
have some ideas for things that might need to change in order for others to properly use modules.
Is that right?
Yeah, I guess we could start chronologically.
Okay.
From Daniel and my perspective, in the last few years, we've had really good success going to a more standard, more regular build system, treating our projects like cattle, et cetera, at Bloomberg.
We've been using CMake as a core technology for that.
We've been engaging with Kitware on certain kinds of CMake enhancements.
I was doing some research, and Daniel was jumping in, too, on, okay, well, this module stuff seems cool.
I see these articles about we're almost ready, or we have this branch in this one project where you can maybe do modules with your build system.
So I was approaching it from, like, okay, what do I need to do?
What do I need to plan for the next year to three years or something for, like, okay, how do we get modules running when our compilers are caught up at Bloomberg, right? And I didn't see the places
to hook up to implement a build system to do modules the way we need to do them at Bloomberg.
In particular, I didn't see the way to integrate the layer between packaging and building. It's a
very important layer. It goes both directions. Like you pull in information from your environment
to build,
and then you provide information to the destination as a package when you ship.
And that information wasn't there. Your dependency, you depend on a module. You say import
foobar. What does that mean? You go to some file system somewhere. There's some interface source
file somewhere. You compile it somehow, like that
information wasn't there. And then in the standard, to be fair, doesn't say anything about any of this,
it talks about what goes inside of your source file. In fact, the standard doesn't say it's a
file, it just says it's, you know, a translation unit, right? And it was very explicit about not
saying these things, right? Right. Okay. So the challenge we're concerned about is like, okay, I don't know, as someone that does a lot of this, how to get modules working at Bloomberg, basically, right?
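To make that gap concrete, here's a minimal sketch with a hypothetical module named foobar; the language specifies what goes inside each translation unit, but nothing here tells a tool where the interface lives, which flags it needs, or whether a compatible prebuilt BMI exists.

```cpp
// --- foobar.ixx (module interface unit; the file name is just a convention) ---
export module foobar;

export int answer() { return 42; }

// --- consumer.cpp ---
// The consumer only writes "import foobar;". Mapping that name to the
// interface file, its compile flags, and a compatible BMI is exactly the
// metadata a build system or package has to supply from somewhere.
import foobar;

int main() { return answer(); }
```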
So Daniel took the lead on writing a paper, and that's P2409, if I got the number right, to kind of describe what we're worried about and why we think we're kind of
blocked to even get started on that. And a lot of that gets into the details of what I was just
saying there. And so that paper was presented to study group 15 at ISO, the tooling working group,
study group, sorry. And we've done subsequent conversations since. That's probably a good introduction, I think.
There's a lot to drill into exactly what the problems are,
which use cases are not supported yet,
and maybe where we can go with that.
But I think that's a good intro.
I think there is one ironic aspect of this,
which is that most people think,
most people that are not involved in the modules conversation
or implementation,
think that part of the implementation that is happening is that there's going to be like an interoperable format
where you can ship like a module file as like a pre-parsed thing that anyone can consume.
And so it's going to be super fast because you don't have to parse that anymore. But the reality is that, with the exception of Visual Studio, the scope of compatibility of the binary module interface files right now is the same as the scope of compatibility of precompiled headers, which is kind of like literally the same compiler binary.
Like even a different build of the compiler
may result in it being like an incompatible format.
So that when we have those conversations,
like I always say that we have a lot of these like,
oh moments where it's like,
you explain some things and it's like, oh.
Yeah.
Like how would you run Clang-Tidy on code with a dependency that comes from your system?
So let's back up a second.
Bloomberg, when it ships packages, uses almost exactly what you've probably seen already in Linux distributions.
We use just normal Linux-style package management.
You would install a library on your system. Your build system would know how to find that library in your system using either a combination of conventions and maybe some metadata like package config or something
like that. Right. So that's what we do at Bloomberg. Part of the reason we're escalating
this is because it's clearly not a Bloomberg specific problem. Like everyone's going to have
this problem. And we were kind of wondering, why isn't anyone talking about this? I thought people
would have issues open on this problem or something. And so that's why we're trying to raise awareness about it. We didn't want to do packaging in the C++ standard yet, at least, but we've got to figure out some kind of packaging standards, really, at least when it comes to shipping pre-compiled libraries that contain module interfaces.
So anyway, so going
back to Clang-Tidy, so let's say you installed a package like Boost, right?
And you want to build,
but you want to analyze your code, right?
But that code says import Boost.fileSystem, right?
Your Clang-Tidy is more or less
for the purposes of this conversation, a compiler, right?
But you don't necessarily have a build system, right?
Most people point out a compile_commands.json file or something like that.
But that compile_commands.json file
is describing how to execute
with whatever your production compiler is
or whatever your local development compiler is,
which probably isn't Clang-tidy.
It's maybe a different version of Clang at best.
It could be GCC, it could be ICC,
it could be anything else.
So those compile flags are not compatible between compilers.
And there's no, like Daniel was saying, there's no standard metadata that's pre-compiled sitting on the machine that all those different tools can go look at to say, okay, what's inside this module?
What are the declarations?
What are the classes, et cetera?
And keep in mind that modules can have pound includes in them.
So you also need to have all the classic old flags of preprocessor definitions, include search paths, things like that.
So that's a use case for, okay, well, how do we get Clang-tidy to work?
But it goes for everything.
How does your IDE know what's inside that module?
How do your other analysis tools?
Yeah. And again, like how do you get different
packages in a
directed graph of dependencies?
How do you get those packages to know
okay, well, this thing is exporting
that interface from this other module
as part of its interface.
You have to go walk the transitive graph to find
all this metadata for everything.
And we have a particular
interest in making sure that we can do that without a very tightly coupled integration into the build system.
Like I talked before about how we do like 20 projects per second static analysis.
We can do that because we extract only the information about translation units through like an image that we send through a distributed compute
platform that executes all those
static analysis. But now
if we're saying
that you need to integrate this
deeply into the build system because
the build system needs to
tell Clang-Tidy how to produce
the different module interfaces
because your compile
command just receives the path to the BMI
that is compatible with that compiler.
So how do you go about
introspecting a build system
from the outside
to be able to run a static analysis tool
if the only thing you see
is the path to the BMI
that's compatible to this compiler?
Right.
Another way to phrase that
is in the real world,
you often have to take,
you have to use multiple build systems in parallel, maybe spread out over time to build a C++
project, like it and all of its dependencies, right? So build systems need to have an
interoperable format to say, okay, I'm a build system and I provided this module pre-built,
here you go. And there's some other build system that says, oh, okay, you have this module
and it has these classes. I got it, right?
But like, that's something we don't have.
So that's what Daniel's second paper is about,
is about how to just kind of describe that broadly
and more importantly, also how to discover that.
Like, okay, so in the second paper, at least, there's a file in the system now that's proposed.
It has some properties of a compile_commands.json file,
and that'll have like include directories and stuff.
And it's just,
it has some properties
of a package config .pc file.
And that has,
it's kind of discoverable.
You can use,
you can model a directed
graph of dependencies.
You can discover the dependencies
without having to
transitively declare
all of them
in your build system,
which is something
that people struggle with now,
which is like,
okay, so you want to add this library.
Well, guess what?
You have to go find all of its transitive dependencies
and add them also to your build system.
And that's a real pain in the neck.
And that's something, again, that's not cattle.
And worse, the people you're depending on can't change things anymore
because that breaks your build because now you pinned a thing
that they didn't pin, and it's horrible, right?
So anyway, that's the problem we're talking about.
So everything is terrible.
It's just a lot of work.
It's not, you know, I want to make sure people realize
there's a lot of work.
There's plenty of room for people to come in
and try things out, implement them,
share their use cases.
We're talking about this in the SG15 mailing list.
That's something people can ask to subscribe to
and they can see some of these conversations.
We've been beating the bushes around the community,
trying to get more people from relevant parts of the C++ ecosystem
involved in these conversations in the scope of SG15.
We've had some good success there from different kinds of vendors
that provide packaging, vendors and other organizations
that provide packaging systems, provide build systems, provide
tool chains.
To some degree, the impact of this, other than a lot of these things breaking, one of the things I'm also concerned about is what happens if one of the vendors figures out something that works really well for them, and then another vendor finds something else out that works really well for them, and then we at Bloomberg have to figure out a third thing that works for us. And then, guess what, we're like, do modules really work at that point? Like, on paper, at the language spec level, sure. But in practice, as an end user, if you're doing a training class and you say, it's really easy, if you want to do modules you just do import foobar, and then there's these 10,000 lines you need to add to your CMakeLists,
it's really easy.
And you have to hard code all these flags
for all the things you ever pull in.
Part of the point of all this is that we start getting out of that habit
and just pushing it into the build system isn't really
a solution. So we kind of need to come up with a way
to kind of describe these things.
So you all have mentioned
a bunch of different compilers and a bunch of different build
tools so far. And just out of curiosity,
what are you actually focused on using? GCC, Visual Studio, CMake?
Is that your main concern?
Like, where's the focus?
Um,
all the above.
Okay.
We have, like I said, a Linux-style distribution, basically.
We take open source projects as tarballed and released by them in most cases.
And maybe with patches as few as possible,
we'll actually build those and provide them to our engineers.
So we do have a few, I mean, you name it,
we probably have at least a few projects
using that build system, for example.
As far as toolchains go,
we mostly target Unix Linux systems,
although some of our teams do support Windows in different ways.
We don't want to be necessarily pinned
to that subset of the ecosystem forever.
Like it is nice to have the option to do more Windows or to ship to Mac or
something at some point.
Right.
So GCC recently said that they support C++20 modules,
but Clang is still doing Clang-style modules.
Is that difference affecting you all?
Yeah, we're using pretty stable compiler builds right now, so it's more of a medium to long-term goal that we would even use modules at Bloomberg, seriously. So this is me staying ahead of my concern. If we don't start working now, we're going to be talking about a decade or two at Bloomberg, right? Not just waiting for GCC or Clang to catch up to some feature. Okay, got it. Yeah, our main concern
is not necessarily the compiler support, although that is absolutely necessary for anything to
happen. But the main thing is, how do we make it actually work when actual people are actually
typing in their editors kind of thing. Actual people involved.
If you remove those from the equation, I think this would all be much easier.
And the compilers, right?
Yes.
Remove the people in the compilers.
We have no problems.
Yeah.
So, I mean, we're right in the wheelhouse of like,
hey, this is exactly the kind of organization that should really benefit from modules.
We'll have really deep dependency trees.
It'd be great not to have to pre-process like all the world just to,
you know,
do a hello world kind of thing.
You know,
it'd be nice to include iostream without blowing up your build times
exponentially or something.
Right.
But like,
it's like,
well,
if I can't put it in the build system,
that doesn't matter though.
Right.
So,
um,
yeah.
I want to interrupt the discussion for just a moment
to bring you a word from our sponsor.
CLion is a smart cross-platform IDE
for C and C++ by JetBrains.
It understands all the tricky parts of modern C++
and integrates with essential tools
from the C++ ecosystem,
like CMake, Clang tools, unit testing frameworks,
sanitizers, profilers, Doxygen, and many others.
CLion runs its code analysis to detect unused and unreachable code, dangling pointers, missing
typecasts, no matching function overloads, and many other issues.
They are detected instantly as you type and can be fixed with a touch of a button while
the IDE correctly handles the changes throughout the project.
No matter what you're involved in, embedded development, CUDA, or Qt, you'll find specialized support for it.
You can run debug your apps locally, remotely, or on a microcontroller, as well as benefit from the collaborative development service.
Download the trial version and learn more at jb.gg/cppcast-clion.
Use the coupon code JetBrainsForCppCast during checkout for a 25% discount off the price of a yearly individual license.
So are you getting kind of good feedback from the committee with these papers?
You know, are others recognizing this as a problem?
Yes. Short answer is yes. The long answer is, the awkward part about this is that the place where we don't have consensus right now is precisely whether we should compromise, like, can we just make modules work? Because the second paper, P2473, is about, let's make modules work with the ecosystem as it is now. And the place where we don't have a consensus is, is that a good idea?
Like, shouldn't we just push ahead and actually fix the underlying package management problem, which would be great.
And I started some conversations in the SG15 mailing list and with some people in other package management systems.
But it's an immensely complex problem to solve.
So I'm not super confident that, if we put the order of operations to be first package management, then solve modules, we will have something usable before 2030 or something.
So the half a loaf we're shooting for, or a quarter of a loaf,
or however much of a loaf we're talking about, I guess,
is maybe we can start with initial standards for... Not a standard, that's the wrong word, because again, an ISO, that's a special word that means special things.
But, you know, a document describing what a package would look like for a C++ project, which no such document exists.
Like, what is a package?
Like, is this a good package?
Could you look at a package and say, this has the stuff you need to be a C++ package?
Forget where it comes from. Forget how
it's compressed. Forget how you download it.
Forget what its name is. But if you looked at
it, does it have the minimal amount of
metadata possible?
That's not something that has been
done before, as far as I'm aware.
It should be really good for the
package management ecosystem.
Maybe someday it could be something we build on
to make an actual package.
I don't know if we'll ever have a package manager.
It'd be nice to at least have interoperable packaging
such that you enable a new package manager
instead of porting to it, if that makes sense.
Because that's kind of the experience right now.
It's like, okay, I'm going to port to Conan,
or I'm going to port to vcpkg,
or I'm going to port to, you know, the Arch ecosystem stuff. Like it's a porting kind of exercise, just like going to a new OS or something, right?
Yeah, if we could get to the point where we standardize, and again package managers communicate with build systems, if just that language becomes more interoperable, it would be a huge step forward.
Because your build system, instead of having to export a package config file, a CMake export file, a Conan file, would be able to just export this one language
and then other build systems would be able to read that language
and be able to consume those libraries.
But even that, like when you start talking,
like I was talking with Todd from the Spack project
and he deals with all the HPC world,
which is way more complicated than I ever thought could be.
And all the ABI zoo that they have
makes any serious attempt of answering that question
way more complicated than one would expect.
And the thing that is really the challenge here
is when you look at folks using a system like we do,
for instance, which is like a simple package management
deployed with Linux file system hierarchy standards
based deployment,
package config is kind of good enough, right?
Like you have dash I's, you have dash L's,
you have dash D's and that's okay.
So it's a really hard lift to say like,
oh, you know, all these people
that have a really like convenient solution
that works well for them
because they have a very specific scenario,
that's not good enough.
You've got to go to this very much more complicated thing
because in order for us to normalize things,
you need to take care of the use cases
that you don't care about.
Right, right.
And the use case that's the simplest is,
okay, just throw everything into the /usr/include directory
and hit the build button, right?
Like that in particular is just not going to work going forward.
Like, again, you have to,
the build system actually has to model the graph of the things you depend
on, instead of just saying: here, compiler, here's a dash-I flag, have fun.
Right.
So it gets,
yeah,
again,
actually,
we didn't actually make a fine on this detail,
but again,
Daniel kind of over it,
but the big thing we want,
one of the big things we want for modules is that you don't have to go
parse that code over and over again.
Right.
We want to say I parsed that once, maybe even like an hour ago.
I'm not, I just grab it. I don't want to reparse that, right? For that to work, you need a file
on the disk somewhere or a database or something that has that information cached. For that to
exist, your build system has to tell your compiler to make it. So, and for that to work, again,
you have transitive dependencies. The build system has to be able to walk the graph and make sure all those files exist. So you're building things
that aren't in your project, but not as textually included into your source file.
That's something that's new here. That's something people sometimes ask us. What's new about modules?
Why isn't this a problem already? And that's one of the answers is like, well, we want to be able
to only build the one time. And so to build only the one time, you have to know when to build it. So you have to say like, oh, I don't have
a binary module interface
file for boost file system.
Make one, put it in your build directory, and we'll use that
going forward. And like Daniel said,
depending on if you have different compilers, you might have
to have a set of those and know which one to pick.
And worse than that, depending, like,
we still have textual inclusion because especially
C libraries and stuff will have that, so you still need to, you know, tweak those if somebody has slightly different dash D flags or something. So we'll need like check sums to know, this is not the one you want, you want this other one that use the dash D something equals one instead of dash D something equals zero, that kind of thing. Right. The good news is all that should be cleaned up over time. But in theory, as long as you're not including C APIs too much.
So there's a lot of work here.
I liked a lot of work coming out of the talks I saw at CppCon in October.
There were a lot of great talks, a lot of cool things about,
hey, we can use concepts this way and that way.
A lot of them kind of had this implication like,
yeah, this will all be header-only kind of, if you do it this way, but we'll have
modules someday. So when we have modules,
like, we don't care about header-only anymore, right? It's all
just a module interface. It's still private.
You don't have to worry about leaking your stuff into
other people's ecosystems or blowing up their compile
times, right? Because modules will solve that.
It's like, well, yeah, maybe, if we could
get adoption at a certain scale, and that's
kind of what Daniel and I were talking about. Like, okay, well, how do we actually
get there? What's the roadmap?
So we're kind of flexible on the roadmap to some degree,
but we would like to, like we just talked about
with the Agile thing, it's nice to take little steps
if you can, instead of trying to like, you know,
boil the ocean up front.
I'm not opposed to boiling the ocean
if we can all get on the same page
about boiling the ocean, though, I guess.
Sounds like an ecological disaster, though.
Well, I guess we are kind of boiling the ocean slowly, aren't we?
Yeah, slowly.
Okay.
I just want to take like two steps back to make sure, for like my own sake here, that I'm...
Please.
Yeah.
Okay.
The current state of the art for modules means that if I want to use modules for my work
right now, I have to rebuild that package on my system with my compiler, with my flags to get that module.
And then I can use it moving forward.
And there's simply no way to distribute these things to anything other than my specific build configuration.
Right. The current state of the art is there are a few build systems.
And I think build2 is advertised as being one of them, and CMake has some branches that people are doing research on. Maybe you can whip up your own, you know, GNU Make version of your own thing. Daniel has an example of how you might do that in his second paper, P2473, right? Anyway, the concern, the state of the art is, and these are the kinds of things you see in these talks about modules, like, hey, look at the compile times I got.
They're so much better. You have one build system
and it knows about everything.
If you are a certain famous
large software companies
and have all your code into a big monopo,
you might be fine.
Or even if it's not in a monorepo,
but every build,
every repo gets built in the same system
in a single workspace.
There's one build system that knows how to build everything.
So like, CMake has a feature called FetchContent,
which will actually go get code.
If you did that for everything, actually you might be okay.
Assuming CMake knew how to within a CMake build,
like map these modules to each other.
It gets difficult when you have to jump into another build system context.
And that could still be the same style CMake, but just a different build directory, if that makes sense. Like, it's enough that you had a different process modeling the mappings between this, what we call source-to-source dependencies
versus source-to-binary dependencies,
where the idea that you have different projects
that can introspect this source tree of each other
and be able to pass information from one build system
for a build system to know about information
from one source tree to the other source tree
versus environments like ours
or like every Linux
distribution where you essentially have these separate steps where you build a project,
you generate the built artifact, you deploy that artifact, and now you consume that artifact
from a different build system without knowing the original source tree or anything.
And that's where the adoption of modules is right now.
All the testimonials that we have of module adoption right now are in the environments of what we call the source-to-source dependency environment,
where the single build system can see the entire thing,
and then it can build the entire graph, and it can run the entire build.
Right. All of the issues you're discussing,
I was just trying to place my memory here. I believe Izzy Muerte was bringing up a lot of these issues in like 2018.
Yeah. And I hadn't really heard anyone talk about it since then. So a part of me was just going,
oh, cool. Someone solved these problems and now I don't have to worry about it.
But it doesn't sound like really anything happened.
That's my guess as to why we're not talking about it.
Because everyone's like, oh, we got, those people are so smart.
Izzy's like, you know, Izzy knows everything.
Yeah, if Izzy's on it, I'm sure it's fine, right?
And I can't explain it, so we're just like, hey, let's start a conversation.
And so, you know, that's where we ended up.
We're like, why isn't anyone talking about this?
We would ask around, like, have you been talking about this?
Have you been talking about this?
Like, why isn't there a paper for SG-15?
And someone said, well, nobody wrote the paper.
We're like, oh, okay.
So you want a paper?
Okay, we could do that.
Right.
So, you know, Daniel did it.
But, you know, I said we.
I mean, we can't underestimate the impact that like 2020 had on everything.
So I think that probably played a huge role there too.
Yeah.
That's,
that's probably fair.
So have you proof-of-concepted any of these? Like, you have a proposal for what a normalization of these things might look like. Have you implemented those?
So I have, in that paper, the second paper.
Yes. So let me clarify one thing. So we talked about normalizing package management, and we
talked about the quarter-loaf solution, as Bret stated, of just making modules work with the
current ecosystem. So the paper P2473 is actually trying to achieve the latter,
which is let's see how we can make modules work
in an environment that looks like the environment we have today.
And the way that that looks is we essentially have a proposal
where modules kind of look like header deployments,
where you can map the module include through like a path on the file system.
You have a search path
and you have a specific translation
from the module name to the path on disk.
But then because modules have the separate parsing context
for the module interface unit,
and by that, I mean the dash I's
for the code consuming a module don't have to match, or I should even say shouldn't match the dash I's of the parsing of the module interface itself.
And so this proposal includes what we call a meta IXX info file that describes here's how you parse this interface. And then the proposal also comes with the convention
on how to name the BMI file,
such that if you find the file with that name,
you know it's compatible and it's the right one.
And then if you don't find it,
you can just use GNU Make's VPATH to say,
oh, it's not here.
I'm just going to produce my own
and now I can consume that module.
So in the process of writing that paper, we actually wrote a little toy project using GNU Make that's linked from the paper to exercise and show, look, we can make this work with GNU Make.
We just need a couple extra steps to generate the module dependencies, which would essentially be like a C++ modules config utility that scans this module search path, which for the case of this toy project
was the C++ modules
MakeMake, which would essentially generate
the rules for producing all those
BMI files that you would need.
But this is all with
a mock compiler and
mock parts. You could do that.
Probably, you'd just want CMake
or something to know how to do all that.
We're also talking to Kitware about like,
how can we get that going or,
you know,
can we try,
try this out?
Hopefully someday,
there'll be something published in that space.
But I'm just going to ask,
like,
have you gotten much feedback from build system implementers or compiler
implementers on some of these proposals?
Yeah,
we work pretty closely with Kitware in particular,
because we use so much CMake. And they're the core maintainers of CMake,
if people didn't know.
So yeah, we've sponsored some work in there
the last couple of years for different things
that we've started with, like QuickWins especially,
and just general CMake workflows.
Like, I don't know, some easy ones to explain
would be environment variables you can set
for CMake build types.
You don't have to just add dash-D CMAKE_BUILD_TYPE to every stupid CMake command you run.
You can just say, just put in your dot files, stop setting it, it's fine.
Yeah, we've seen some of those in the more recent releases, release notes come up, yeah.
Yeah, some of those are Bloomberg sponsored, because we're like, again, that's not cattle.
I don't want to have to go to all my CI setups and add the dash D, whatever, to every single one of them.
That kind of stuff.
So we're talking to them about modules.
There are Kitware people in the SG15 meetings, keeping tabs on things too.
So as far as that goes, we're pretty confident.
We'll figure something out there.
The other build system, there are other build system implementers in the SG15 meetings.
They haven't objected saying,
I could never do that.
I can never implement it.
Not yet anyway.
I wouldn't say every single person
that's ever worked on a build system
or cares has been in the meetings.
So I would encourage people
that care a lot about build systems
to at least subscribe to the mailing list
and provide feedback in other channels,
even if you can't make the meetings.
I guess the other question I have is,
sorry, just real quick,
does this require actual ISO buy-in,
or is it just something that the different compiler vendors,
build system vendors kind of need to agree on
to have a set of standards?
There's a long answer, but the short answer is
ISO doesn't really do specs for build systems
or packaging systems or anything like that.
So Daniel kind of said earlier,
the best we can do is publish a report saying,
hey, we polled and we all think this is a paper
that you could read.
And we like it as a paper
that you could read.
I guess I'm not an expert on ISO workings
and how all that works.
Possibly you'd need a whole other ISO group to just be like build systems, C++ build systems versus the language.
But a different guest could probably give you a better answer to that question.
But that being said, like, yes, I mean, the real end goal, if we're talking about outcomes, like the Agile thing, what are we actually trying to do, where are we actually going here? But you're right on the nose.
Like the real thing is we want to know that the ecosystem is going to be
there,
that they're at least not saying I'll never do it.
It's like,
if you really,
like we would live with,
if you twisted my arm and you donated time or money or volunteer work or
something,
I guess we can make that happen.
A lot of times that's what,
something that people need to struggle with as engineers
because we'd like the perfect, the pristine, the, you know,
like let's get all perfect and then we'll ship it.
A lot of times what we really need for these kinds of consensus-driven things
is, yeah, I can live with that.
You know, like I would sleep fine at night if that's what was happening.
And that's kind of what we're shooting for.
That's the expectation. Some people will be at "perfect," I'm sure, but we want nobody to be out; we want everyone to be at least in "I can live with it."
That's actually really important, right?
Because anything short of that,
and we start ending up fracturing the ecosystem,
we'll have modular C++ people
and non-modular C++ people,
and then the Rust people over there
gloating at everybody else.
We don't want that.
I think one very positive thing that has come out over the rewarming of SG15 over the past few months is that we're seeing more and more people from really important projects being more involved with SG15. I think in the last meeting we had someone from Red Hat, which, from the package management perspective, having someone from a Linux distribution is kind of important.
We had Olga from Visual Studio at Microsoft in the meeting.
Having Visual Studio in the table is really important if we're talking about converging things.
We had CMake and we have other people that are starting to be more present on SG15
and I think this is a good sign.
Right, right. So if you're not involved,
fear of missing out.
Just keep that in mind.
Is it possible Red Hat
actually invented the modern package?
Before that, everything was just a tarball?
I think chronologically
speaking, Debian came before.
Really? Yeah, I think Debian came before.
Debian was created in 1993
and they used packages
you know that Debian
is called Debian because it was
Ian Murdock and his wife
was called Debra
so it's Debian
I had a flashback when you
mentioned Red Hat getting involved, because at my first real job in '97,
I actually bought the RPM manual because I had to generate some Red Hat packages, and there was no automated tooling for that yet.
Anyhow.
There's even automated tooling now, so it's better.
Right.
That's kind of what we'd like to see is someday,
like if you just ran, you know, whatever your build system is,
but I know CMake, so I'll use that.
CMake and then CMake install,
and you end up with something that looks like a package and you just tar it up and you're done.
Right.
And that's where we'd like to get.
I think we can get there.
Yeah, a lot of sticking points these days are about things
like, as Daniel said, how do we deal with file systems?
Again, there are no files in the ISO standard.
If we have the interface, do we want to actually say the file needs to be named this way and live in this part of the file system?
How does that work?
Sticking points include, could we do more sophisticated solutions?
Gaby Dos Reis had a really interesting talk at CppCon in October about something called IFC. I can't remember what the acronym stands for, but it has to do with
an interchange format whereby you can serialize the abstract syntax graph of a compiler's parsed knowledge of some source code, right, then save it in a standard format, and some other compiler or tool, like an IDE or clangd or something, can go open that up without having to do all that extra work, just walk that graph and decide things. The ideal outcome being you could write, you know, some Python tree-walking code and do interesting source code analysis things without having to actually have libclang or something linked into your executable. Great.
It's really interesting. That'd be great for Bloomberg. We'd love for that.
We'd love to have that. But I guess
like, do we want to wait 10, 20
years for that to propagate to all the tools we
need it to? I don't know, right?
Let's start on it, sure. But let's get something
else in the meantime kind of thing.
I think Visual Studio can emit that today,
I believe, for the
sake of our listeners here.
It might not be production ready for a large-scale company like yours.
No, C++ modules on Visual Studio are implemented in terms of that format.
Okay, right.
Again, we'd be okay if enough of the ecosystem all got together and said,
okay, we're going to sponsor these groups to go add these pull requests to these compilers.
And we thought, hey, we can get this done in five years. Yeah, the challenge that I see there
is the reason why the AST changes
between different versions of Clang is not necessarily because the
C++ language changed, although, like, if you adopt different standards, you would
of course have different abstract syntax trees, but even
on the same language standard, from Clang 12 to Clang 11,
there are differences in the AST.
And the reason for that is, oh, it turns out that there are some niche cases
that can't be represented in the AST as it was before.
So the AST was adapted to be able to represent those valid C++ use cases.
So one other way of thinking about this
is that this standardized serialization format
would need to go through the same level of rigor
that we have on the C++ language
when it comes to the lexer and the parser, right?
Because you would need to be able to specify
to an extent where you say,
if it can't be represented by this abstract syntax tree,
it's not valid C++.
And that's a really, really hard challenge.
And I'm personally supportive.
Like if we want to like move the C++ language standard
to include like this standard abstract syntax representation of the language,
not just at the level of the tokenizer and the parser,
I'm all for it.
But it's a substantially big amount of work.
Like the W3C DOM or something.
Yeah, a lot like that, actually.
Okay, well, hopefully, at some point in the future,
build systems, compilers, and everything will converge
and we'll start to be able to get to use modules more.
Is there any other changes that we haven't talked about
that you're hoping to see from modules,
either from the standard or from compilers and build systems?
We need support from the build systems.
We are working with Kitware for CMake
to support building modules internally first
and then to have support for shipping
and consuming module libraries
outside of CMake to CMake scenarios
because Kitware is fine just adding more things
to the CMake export file.
And then if you're consuming from CMake, it's all fine.
But we don't really want to have like that tight dependency
on everyone using the same build system.
So having more implementations on the build system side
is definitely important.
My expectation right now is that the language feature is
fine. I don't think we have any problems
today that are caused by the language level
specification. So now it's more mechanically, how
do we make it such that the human can go write code, ship
an artifact to another human to consume?
On the contrary, we have a lot of problems
that modules are supposed to solve, right?
So I'm really excited.
If I didn't have to worry about people
setting inconsistent, binary-incompatible flags
across different translation units,
that would be fantastic.
Things like how big is your long, and things like that, right?
Like, you know, I'm talking about like the dash-D large-file flags. Yeah. Or, here's a fun bit of info: you'll hear a lot of people say that the standard C++ version is ABI-stable, is ABI-unimportant, right? You can change your standard flag. Which is actually not true at scale. When you get to a certain scale,
which isn't that big, actually,
you'll find that some libraries
start doing feature detection
using template metaprogramming, right?
Like if you have lambdas available,
use this implementation,
otherwise use this other implementation
and those will be ABI-incompatible implementations.
I'm hand waving a lot, right?
But you'll see it at a certain scale.
You see it already in popular libraries like
Abseil. Like, Bloomberg has something called BDE, which is our kind of analog to Abseil or Boost or something, right?
And so those libraries do that. Everybody depends on
those libraries, and so all those things are
technically ABI-unstable, depending on what standard
version you set. So point being, for
modules, at least you wouldn't have to worry about
that thing leaking
out a little bit, maybe.
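A made-up example of the feature-detection pattern being described here, where the class layout depends on the language level it was compiled under:

```cpp
// Hypothetical illustration: the member only exists in C++17 builds, so a
// translation unit built with -std=c++14 and one built with -std=c++17 see
// different layouts for the same type -- technically ABI-incompatible even
// though each one compiles cleanly on its own.
#include <cstddef>

struct Widget {
#if __cplusplus >= 201703L
    std::size_t cached_size = 0; // "newer" implementation detail
#endif
    int value = 0;
};
// sizeof(Widget) differs across those two builds, so passing a Widget between
// them is exactly the kind of preprocessor-driven ABI break being discussed.
```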
It seems like it would be moving in the right direction
and we can relax a lot of things.
Preprocessor-driven
ABI breakages are going
to be way easier to manage
after modules. We'll just have to worry
about within the
things that you textually include, is it a problem?
I don't know how to convert all the
things to modules yet necessarily,
but assuming we were there,
like there's a lot of problems we could solve and there's problems we could
even contain. Like, okay,
maybe we can only get half the things modularized anytime soon,
but that at least contains those fires to like subsections of the code base
or something. Right. Yeah. So, and that's our day to day.
I mean, if that wasn't clear, Daniel and I do a ton of like, okay,
how do we get all of our code,
like including open source code to do the new compiler
or to remove this kind of ABI unstable thing or something.
We didn't have a chance to really talk about ABI
before running out of time.
All right, you can have us back on some other time.
I'll talk your ear off about ABI if you want.
Oh, goodness gracious.
I'll say Marshall Clow's talk
was pretty on the nose
from our perspective, I'd say. I mean, when he was
on here a while ago about
ABI, that's pretty on the nose.
There's a lot of weird edge cases
and at a certain scale, it's really hard.
It's really easy to underestimate
how complicated it is
and a lot of people think that
it's fine to ignore the problems.
It's especially important at the OS level.
Because we use OS, like vendor-provided,
OS-provided compilers and standard libraries
and some low-level libraries like Zlib, things like that.
Those things really need to be ABI-stable.
We build the rest of the world from source in a certain respect.
You can check our talk on packaging that Daniel and I did this October at CppCon. It has enough detail that you can probably figure out where we're coming from as far as that goes.
This is all related, so modules should help.
Gaby Dos Reis's talk talks about,
hey, ABI is going to be a lot better if you use modules.
I think he's right, but how do we get there?
That kind of thing.
All right.
Well, Bret and Daniel, it was great having you on the show today.
Thank you so much for coming on.
Yeah.
Thanks for having us.
It's been fun.
Thanks a lot.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in or if you have a suggestion for a topic.
We'd love to hear about that, too.
You can email all your thoughts to feedback at cppcast.com.
We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter.
You can also follow me at robwirving and Jason at lefticus on Twitter.
We'd also like to thank all our patrons who help support the show through Patreon.
If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode is provided by podcastthemes.com.