CppCast - Distributing C++ Modules
Episode Date: December 16, 2021

Rob and Jason are joined by Daniel Ruoso and Bret Brown from Bloomberg. They first talk about Jason's new Object Lifetime Puzzlers book and a blog post from Kevlin Henney on Agile processes. Then they talk to Daniel and Bret about their research into using Modules at Bloomberg, and some of the changes still needed from compilers and build systems to use Modules in large-scale software development.

News
- Boost v1.78
- Object Lifetime Puzzlers Book 1
- Agility ≠ Speed

Links
- P2409: Requirements for Usage of C++ Modules at Bloomberg
- P2473: Distributing C++ Module Libraries
- What Agile is Actually About?
- CppCon - Lessons Learned from Packaging 10,000+ C++ Projects - Bret Brown & Daniel Ruoso - CppCon 2021
- CppCon - Modern CMake Modules - Bret Brown - CppCon 2021

Sponsors
- Use code JetBrainsForCppCast during checkout at JetBrains.com for a 25% discount
Transcript
Exclusively for CppCast, JetBrains is offering a 25% discount for purchasing or renewing a yearly
individual license on the C++ tool of your choice: CLion, ReSharper C++, or AppCode. Use the coupon code JetBrainsForCppCast during checkout. In this episode, we talk about object lifetimes and agile development.
Then we talk to Daniel and Brett from Bloomberg.
Daniel and Brett talk to us about modules and what changes need to be
made to use them. Welcome to episode 329 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm all right, Rob. How are you doing?
Doing okay. Getting ready for the holiday
season. Yeah. If you're ordering Christmas presents, you need to order them today, I think.
I think I'm all done with my Christmas shopping. Are you done or you still got more to do?
I think I'm done. I'm pretty sure I'm done. That's good. I actually started a couple months ago. I'm
like, ooh, I'm gonna order that before I forget about it. Okay.
Well, at the top of every episode, I always read a piece of feedback.
We got a lot of great feedback from last week's episode.
It was a lot of fun talking to Kate and Guy.
This tweet's from Rob Bernstein saying this episode was fantastic.
I didn't want to turn it off.
I love the interaction between the four of you and all the journeys into different topics.
I've pre-ordered the book.
Can't wait to dive in.
Yeah.
We got a lot of feedback on Twitter for that one.
We got a lot of good feedback.
The one thing we did get also was the most Reddit comments on any post ever, I think.
But it was not about the show at all.
I saw a comment.
Yeah.
200 posts of just someone ranting about Rust versus C++.
That had nothing to do with the episode.
That sounds great.
Yeah.
So I was a little sad to see that there were that many comments
that had nothing to do with the show.
So welcome.
We've got lots of feedback, though.
Yeah.
Well, we'd love to hear your thoughts about the show.
You can always reach out to us on Facebook, Twitter, or email us at feedback at cppcast.com.
And don't forget to leave us a review on iTunes or subscribe on YouTube.
Joining us today is Daniel Ruoso.
Daniel is the manager for code governance at Bloomberg with a focus on driving large-scale static analysis and automated refactoring. Daniel has been working the past 20-plus years
with a persistent lens on how to help engineers
be more effective with build, deployment, and analysis tooling
on various different environments and languages
with a more recent focus on bringing C++ modules
to a state where they can be used by more people.
Daniel is a Brazilian music nerd
and will talk endlessly about that if you let him.
He also plays classical guitar.
Daniel, welcome to the show.
Thank you for having me.
Yeah, welcome.
How large is the Code Governance team, if I might ask?
So we have around 25 people between full-time employees and contractors.
We're driving, we have a static analysis pipeline that runs around 20 projects per second,
running various tools across the entire code base
that we can see.
Wow.
So it's been a journey.
We have a lot of tools that we write in-house
for specific refactorings,
for getting rid of old dead code,
for feature switches that are considered fully deployed,
and for removing blockers for migrating to newer
platforms and those kinds of things. So it's been a wild journey. Wow, sounds like that's pretty
crazy. Yeah. Also joining us today is Bret Brown. Bret is the team lead for the Bloomberg Build
Tools team, focusing on compilation tool chains, build systems, and large-scale code migrations.
Bret likes improving C++ software development by treating projects more like cattle and less like pets.
In October at CppCon, Bret presented a talk on CMake modules
and co-presented a talk on packaging C++ with Daniel.
Both Bret and Daniel are active participants in the ISO Tooling Study Group.
Bret is a lead and founding member of the C++ Guild at Bloomberg.
He contributes to Bloomberg Working Groups on C++ tooling, testing, conferences, deprecations, and ISO engagement.
Bret, welcome to the show.
Hey, glad to be on.
Can you give us the elevator pitch on what you mean between more like cattle and less like pets?
Yeah, I mean, it seems like there's a lot of reasons for this, but it seems like a lot of C++ projects people treat as a special thing,
and then they end up adding special build rules, special directory layouts, all sorts of things.
And at the scale we're operating at Bloomberg, like Daniel was talking about, that gets to be really, really expensive. And even at a smaller scale, if you have a side project, you probably
don't want to sink a lot of time into just keeping things running, right? So really what people want
is something a little more regular so that they can use the same CI pipeline on all their projects.
They can use the same analysis techniques. It works with all their IDEs. And a lot of the
bespoke things that people add are actually the things that get in the way of that.
So a lot of what I focus on in my free time, my fun projects are ways to get those kinds of
pain points out of the way to let people have all the features they want without having to bake those features into the physical project they're working on, if that makes sense.
So my talk about CMake modules is, for instance, a technique to get your bespoke CMake out of your project and have that live as something you can reuse across your projects.
Very cool.
All right.
Well, Bret and Daniel, we've got a couple news articles to discuss.
Feel free to comment on any of these,
and then we'll start talking more about the work you're doing at Bloomberg with modules and everything, okay?
All right.
First thing we have is a new version of Boost.
It's always worth mentioning these.
This is version 1.78, and lots of changes to ASIO
file system,
BoostBeast. I don't think there's
any new libraries here, are there?
I didn't see any new ones, no.
No new libraries added.
Anything in particular anyone wanted to
point out in all these
changelogs? I think it's interesting there's a breaking change to how regexes are parsed.
Oh, I missed that.
That's interesting.
Yeah.
Capital B versus lowercase B as per Perl behavior.
That's, um, that's, that's kind of fascinating.
Cause I mean, there's like a lot of boost regex code out there right now.
Right?
Like, one of the projects that I'm working on might still use Boost.Regex.
I think we may have moved it to std::regex.
I'm not sure now.
But if it's still using Boost.Regex and if it had this in here,
it might be a while before we noticed that it broke parsing in some of our code.
I'm not too familiar.
I don't know regex well enough to know what this breaking change means with slash capital B or slash lowercase B.
I don't know what that means.
Slash B is the word boundary.
Okay.
And then capital B would be not a word boundary, right?
Okay.
So just like lowercase S is any white space, capital S is any non-white space.
Okay.
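To make the distinction concrete, here's a minimal sketch using std::regex, purely to illustrate the semantics of the two escapes; the 1.78 change itself is about how Boost.Regex parses them relative to Perl.

```cpp
// Minimal illustration of \b versus \B, shown with std::regex just to
// demonstrate the semantics being discussed.
#include <iostream>
#include <regex>
#include <string>

int main() {
    const std::string text = "cat catalog";

    // \bcat\b: "cat" bounded by word boundaries, i.e. the standalone word.
    const std::regex whole_word(R"(\bcat\b)");

    // cat\B: "cat" followed by a NON-boundary, i.e. the "cat" inside "catalog".
    const std::regex embedded(R"(cat\B)");

    std::cout << std::regex_search(text, whole_word) << '\n'; // prints 1
    std::cout << std::regex_search(text, embedded) << '\n';   // prints 1
    return 0;
}
```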
So they're changing to make it look more like Perl or less like Perl?
More like Perl is what it seems to be.
I've got a confession to make that I'm a long-time Perl guy.
That's,
um,
so you speak like Hubert basically.
I also find it interesting that there's so many changes to Asio,
because,
you know,
we've talked a lot about the possibility of standardizing networking
and with the reluctance to ever break ABI in the standard,
but it does seem like something like ASIO,
it's a moving target.
I mean, they're constantly updating the network libraries
and it just makes me question,
does networking actually have a place in the standard or not?
There was a conversation on C++ now in 2019
where we were talking about this idea
of adding graphics library to the standard.
And at some point, I was making the comment,
do we really want a GCC to depend on Xorg or Wayland?
Right.
And it's kind of like a ridiculous proposition
on the face of it.
But I understand why people want it because the lack of package management in C++ means that the only way for you to reasonably do this in a way that's not full of bespoke solutions is to actually put it in the standard.
I guess since we don't have a package manager, I guess everything has to go in the standard so we can do it in a uniform way.
Yeah, even for less controversial features like threading support, we've actually had problems adopting at Bloomberg using standard stuff, because there's implicit dependencies between
your stable standard library and things that are significantly less stable, like oneTBB, right?
Yeah, libTBB does not have any API promise at this point. So if you're using execution from C++17
and you ship your executable in a non-containerized environment,
it's very likely that you don't know that you risk breaking at runtime.
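For context, this is the kind of innocuous-looking code that picks up the dependency; a minimal sketch, assuming GCC's libstdc++ where the C++17 parallel algorithms are backed by TBB and you typically link with -ltbb.

```cpp
// Sketch of the hidden dependency: the execution policy is standard C++,
// but on GCC's libstdc++ the parallel algorithms are implemented on top of
// TBB, so whatever TBB is installed at runtime becomes part of your
// deployment story.
#include <algorithm>
#include <execution>
#include <vector>

int main() {
    std::vector<int> v{5, 3, 1, 4, 2};
    std::sort(std::execution::par, v.begin(), v.end());
    return v.front(); // 1
}
```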
Right. All right.
Jason, do you want to talk about this next one?
Because this next news item is coming from you, right? So I thought, what the heck, I'm going to go ahead and make it into a full-on puzzle book. So I just released my first puzzle book.
It is called Object Lifetime Puzzlers Book 1, which implies that I plan to release more than one of them.
Did any of you have the chance to look at this at all, by the way?
Yeah, I'm really bad at these things.
Like there's this Twitter post about what does this do?
Oh, no, no, no, no, no, no.
Let's be clear. No, no, no, no.
This is explicitly
not Shafik's polls.
Okay.
These are fun.
Well, it was fun. So I did.
I turned on The Sound of Silence by Simon & Garfunkel
and I started doing the puzzle
and I got
Be Sure to Drink your Ovaltine.
Is that the right answer to the, to the sample you had?
I think that's right.
Yeah, that's right.
That's right.
That's right.
Yeah.
So, uh, if you, if you actually do get the book, like I do recommend people get the print
copy of it because it is a puzzle book and you want to write in it.
Um, it actually covers automatic lifetime, static lifetime,
dynamic, and thread local object lifetime. And it's all just like sorting out when does an object
lifetime end? And when does it begin? When does it begin? When does it end? And then you fill in
the puzzle and it explains it all along the way. So my niece, who's actually visiting at the moment,
happened to be here right when the proof copy came from Amazon. So I just handed it to her. She has absolutely no experience
or interest in programming at all and had to ask two questions from me and then just tore into it
and spent like the next three hours doing the puzzles. She had like a lot of fun with that.
That's really neat. Yeah. I mean, we get this stuff.
These are the kind of edge cases we hit
when we're upgrading a compiler.
And the compiler suddenly changes
when the destructor gets called.
There's always a trigraph involved.
I mean, not a trigraph,
like a ternary operator involved for some reason.
But yeah.
Well, it shouldn't.
Anything that happened,
if you're not talking about a global static,
then your compiler was broken previously because these things are well
defined.
That's the point of C++.
Right.
Yeah.
So I'm not arguing with you.
A lot of times our bugs are like,
this is weird edge case.
And it's not,
it was broken before.
And now it's hopefully less broken, but it's still kind of broken, but it's changing, or it's broken in a different way.
None of these,
none of these straddle the edge of a stud of the line of edge cases.
They all work on all compilers.
It's all verified code.
So that's good though.
You know,
maybe we should pass this around Bloomberg.
One of the migrations we're doing is to get rid of the other kinds of statics you were just talking about in the code base.
The global ones.
Yeah.
Maybe we should start passing this around Bloomberg.
It's like, do these things instead.
Because that's what we need.
We're like, okay, if you want static lifetimes, these are the defined things.
Stop putting them in the top of your header files and stuff like that.
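For anyone following along, a small sketch of the distinction being made here, the "defined things" versus the header-scope statics being discouraged:

```cpp
// Illustration of the two kinds of statics being discussed.
#include <string>

// Discouraged: a namespace-scope static in a header gives every translation
// unit that includes it its own copy, initialized before main() in an
// unspecified order relative to other globals.
static std::string config = "default";

// Preferred: a function-local static has a well-defined lifetime -- it is
// constructed on first call (thread-safely since C++11) and destroyed at
// program exit.
std::string& get_config() {
    static std::string instance = "default";
    return instance;
}
```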
So I did actually, for the record, intentionally meant this to be fun.
So hopefully at least some people have fun with that.
Like I said, I know non-programmers can have fun with it.
My actual concern is that C++ programmers aren't going to have fun because they're going to question everything that's in there.
And trust me, it is all well-defined behavior.
Like, just get that out there.
There's no, like, I even had one of my patrons almost immediately be like, I don't think that's defined behavior.
I'm like, yes, it is.
I've been teaching this for five years.
It is.
But C++ programmers don't trust the compiler.
People who aren't programmers just trust the rules that I put in the book.
All right.
Yeah.
The person probably even started with, um, actually.
No, no, no.
All my patrons are really cool.
Okay. And then the last thing we have here is a blog post from Kevlin Henney, who was definitely a great guest we had on a while ago.
And this is, you know, he's kind of having a take on Agile and how maybe we're not tracking the right things with Agile.
Does anyone have a better description of this one?
So I've been using Agile for like a really long time. Like, I started with Extreme Programming back in like 2001, kind of thing. And this has consistently been the hardest thing
to explain to people that like the point of us measuring those things is not to make it go faster.
The point is to make sure that when I'm talking to the customer that I'm saying,
we think that we're going to deliver this in four weeks from now,
that I'm actually going to deliver that four weeks from now.
And that it's better to say, I'm not going to deliver,
than to say, I'm going to deliver and not do it.
And this is consistently the thing that people get wrong when thinking about Agile is just thinking that the point is to go faster.
But in my experience, the main point is to be predictable.
I think it's also good that Agile helps you think about where am I going? Am I still going the right
way? You have little checkpoints to do that sort of thing.
And the point, as he says in the article,
isn't to be fast and sprint, per se.
It's to have those shorter increments
where you're thinking, okay, what about now?
Am I doing the right thing now?
And not wait a month, a year, or worse to decide,
oh, wait, that whole time I was just wasting my time.
I should have done something else.
So his analogy to vectors and mathematics was really effective.
I thought that was a good way to explain it.
Like you need, yeah, you want a vector that has a long magnitude
in this thought experiment,
but you also want it to have the right direction, right?
That's the point of a vector.
It's not just, so if you're talking about how fast are you going
in a two-dimensional space or something, it doesn't matter where you point it.
It's not just how long the arrow is that you're drawing or something.
And that goes for software projects, too.
Are we going the right way?
Going the right way is way more important than going fast, because you can go fast in the wrong way and end up worse off than you started.
So that was a really good point. And if we're continuing the analogy that you're making there,
we also kind of want to find what the unit vector is
to get that predictability.
That's a good point, I didn't think about that.
I have a standard rant on this conversation.
If you go to my LinkedIn, you'll find this article, I can put a link somewhere
later, I guess, on how most people miss the point of Agile. And I think this article was
really interesting in bringing that point. All right. So Bret and Daniel, I guess what's a good place to start?
Because you have been working on switching to modules at Bloomberg, and it sounds like you've kind of faced some challenges,
have some ideas for things that might need to change in order for others to properly use modules.
Is that right?
Yeah, I guess we could start chronologically.
Okay.
From Daniel and my perspective, in the last few years, we've had really good success going to a more standard, more regular build system, treating our projects like cattle, et cetera, at Bloomberg.
We've been using CMake as a core technology for that.
We've been engaging with Kitware on certain kinds of CMake enhancements.
I was doing some research, and Daniel was jumping in, too, on, okay, well, this module stuff seems cool.
I see these articles about we're almost ready, or we have this branch in this one project where you can maybe do modules with your build system.
So I was approaching it from, like, okay, what do I need to do?
What do I need to plan for the next year to three years or something for, like, okay, how do we get modules running when our compilers are caught up at Bloomberg, right? And I didn't see the places
to hook up to implement a build system to do modules the way we need to do them at Bloomberg.
In particular, I didn't see the way to integrate the layer between packaging and building. It's a
very important layer. It goes both directions. Like you pull in information from your environment
to build,
and then you provide information to the destination as a package when you ship.
And that information wasn't there. Your dependency, you depend on a module. You say import
foobar. What does that mean? You go to some file system somewhere. There's some interface source
file somewhere. You compile it somehow, like that
information wasn't there. And then in the standard, to be fair, doesn't say anything about any of this,
it talks about what goes inside of your source file. In fact, the standard doesn't say it's a
file, it just says it's, you know, a translation unit, right? And it was very explicit about not
saying these things, right? Right. Okay. So the challenge we're concerned about is like, okay, I don't know, as someone that does a lot of this, how to get modules working at Bloomberg, basically, right?
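To make that gap concrete, here's a minimal sketch with a hypothetical module named foobar; the language specifies what goes inside each translation unit, but nothing here tells a tool where the interface lives, which flags it needs, or whether a compatible prebuilt BMI exists.

```cpp
// --- foobar.ixx (module interface unit; the file name is just a convention) ---
export module foobar;

export int answer() { return 42; }

// --- consumer.cpp ---
// The consumer only writes "import foobar;". Mapping that name to the
// interface file, its compile flags, and a compatible BMI is exactly the
// metadata a build system or package has to supply from somewhere.
import foobar;

int main() { return answer(); }
```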
So Daniel took the lead on writing a paper, and that's P2409, if I got the number right, to kind of describe what we're worried about and why we think we're kind of
blocked to even get started on that. And a lot of that gets into the details of what I was just
saying there. And so that paper was presented to study group 15 at ISO, the tooling working group,
study group, sorry. And we've done subsequent conversations since. That's probably a good introduction, I think.
There's a lot to drill into exactly what the problems are,
which use cases are not supported yet,
and maybe where we can go with that.
But I think that's a good intro.
I think there is one ironic aspect of this,
which is that most people think,
most people that are not involved in the modules conversation
or implementation,
think that part of the implementation that is happening is that there's going to be like an interoperable format
where you can ship like a module file as like a pre-parsed thing that anyone can consume.
And so it's going to be super fast because you don't have to parse that anymore. But the reality is that, with the exception of Visual Studio, the scope of compatibility of the binary module interface files right now is the same as the scope of compatibility of precompiled headers, which is kind of like literally the same compiler binary.
Like even a different build of the compiler
may result in it being like an incompatible format.
So that when we have those conversations,
like I always say that we have a lot of these like,
oh moments where it's like,
you explain some things and it's like, oh.
Yeah.
Like how would you run Clang-Tidy on code with a dependency that comes from your system?
So let's back up a second.
Bloomberg, when it ships packages, uses almost exactly what you've probably seen already in Linux distributions.
We use just normal Linux-style package management.
You would install a library on your system. Your build system would know how to find that library in your system using either a combination of conventions and maybe some metadata like package config or something
like that. Right. So that's what we do at Bloomberg. Part of the reason we're escalating
this is because it's clearly not a Bloomberg specific problem. Like everyone's going to have
this problem. And we were kind of wondering, why isn't anyone talking about this? I thought people
would have issues open on this problem or something. And so that's why we're trying to raise awareness about it. We didn't want to do packaging in the C++ standard yet, at least, but we've got to figure out some kind of packaging standards, really, at least when it comes to shipping pre-compiled libraries that contain module interfaces.
So anyway, so going
back to Clang-Tidy, so let's say you installed a package like Boost, right?
And you want to build,
but you want to analyze your code, right?
But that code says import Boost.fileSystem, right?
Your Clang-Tidy is more or less
for the purposes of this conversation, a compiler, right?
But you don't necessarily have a build system, right?
Most people point out a compile_commands.json file or something like that.
But that compile_commands.json file
is describing how to execute
with whatever your production compiler is
or whatever your local development compiler is,
which probably isn't Clang-tidy.
It's maybe a different version of Clang at best.
It could be GCC, it could be ICC,
it could be anything else.
So those compile flags are not compatible between compilers.
And there's no, like Daniel was saying, there's no standard metadata that's pre-compiled sitting on the machine that all those different tools can go look at to say, okay, what's inside this module?
What are the declarations?
What are the classes, et cetera?
And keep in mind that modules can have pound includes in them.
So you also need to have all the classic old flags of preprocessor definitions, include search paths, things like that.
So that's a use case for, okay, well, how do we get Clang-tidy to work?
But it goes for everything.
How does your IDE know what's inside that module?
How do your other analysis tools?
Yeah. And again, like how do you get different
packages in a
directed graph of dependencies?
How do you get those packages to know
okay, well, this thing is exporting
that interface from this other module
as part of its interface.
You have to go walk the transitive graph to find
all this metadata for everything.
And we have a particular
interest in making sure that we can do that without a very tightly coupled integration into the build system.
Like I talked before about how we do like 20 projects per second static analysis.
We can do that because we extract only the information about translation units through like an image that we send through a distributed compute
platform that executes all those
static analysis. But now
if we're saying
that you need to integrate this
deeply into the build system because
the build system needs to
tell Clang-Tidy how to produce
the different module interfaces
because your compile
command just receives the path to the BMI
that is compatible with that compiler.
So how do you go about
introspecting a build system
from the outside
to be able to run a static analysis tool
if the only thing you see
is the path to the BMI
that's compatible to this compiler?
Right.
Another way to phrase that
is in the real world,
you often have to take,
you have to use multiple build systems in parallel, maybe spread out over time to build a C++
project, like it and all of its dependencies, right? So build systems need to have an
interoperable format to say, okay, I'm a build system and I provided this module pre-built,
here you go. And there's some other build system that says, oh, okay, you have this module
and it has these classes. I got it, right?
But like, that's something we don't have.
So that's what Daniel's second paper is about,
is about how to just kind of describe that broadly
and more importantly, also how to discover that.
Like, okay, so in the second paper, at least, there's a file in the system now that's proposed.
It has some properties of a compile_commands.json file,
and that'll have like include directories and stuff.
And it's just,
it has some properties
of a package config .pc file.
And that has,
it's kind of discoverable.
You can use,
you can model a directed
graph of dependencies.
You can discover the dependencies
without having to
transitively declare
all of them
in your build system,
which is something
that people struggle with now,
which is like,
okay, so you want to add this library.
Well, guess what?
You have to go find all of its transitive dependencies
and add them also to your build system.
And that's a real pain in the neck.
And that's something, again, that's not cattle.
And worse, the people you're depending on can't change things anymore
because that breaks your build because now you pinned a thing
that they didn't pin, and it's horrible, right?
So anyway, that's the problem we're talking about.
So everything is terrible.
It's just a lot of work.
It's not, you know, I want to make sure people realize
there's a lot of work.
There's plenty of room for people to come in
and try things out, implement them,
share their use cases.
We're talking about this in the SG15 mailing list.
That's something people can ask to subscribe to
and they can see some of these conversations.
We've been beating the bushes around the community,
trying to get more people from relevant parts of the C++ ecosystem
involved in these conversations in the scope of SG15.
We've had some good success there from different kinds of vendors
that provide packaging, vendors and other organizations
that provide packaging systems, provide build systems, provide
tool chains.
To some degree, the impact of this, other than a lot of these things breaking, one of the things I'm also concerned about is what happens if one of the vendors figures out something that works really well for them, and then another vendor finds something else out that works really well for them, and then we at Bloomberg have to figure out a third thing that works for us. And then, guess what, we're like, do modules really work at that point? Like, on paper, at the language spec level, sure. But in practice, as an end user, if you're doing a training class and you say, it's really easy, if you want to do modules you just do import foobar, and then there's these 10,000 lines you need to add to your CMakeLists,
it's really easy.
And you have to hard code all these flags
for all the things you ever pull in.
Part of the point of all this is that we start getting out of that habit
and just pushing it into the build system isn't really
a solution. So we kind of need to come up with a way
to kind of describe these things.
So you all have mentioned
a bunch of different compilers and a bunch of different build
tools so far. And just out of curiosity,
what are you actually focused on using? GCC, Visual Studio, CMake?
Is that your main concern?
Like, where's the focus?
Um,
all the above.
Okay.
We have, like I said, a Linux-style distribution, basically.
We take open source projects as tarballed and released by them in most cases.
And maybe with patches as few as possible,
we'll actually build those and provide them to our engineers.
So we do have a few, I mean, you name it,
we probably have at least a few projects
using that build system, for example.
As far as toolchains go,
we mostly target Unix Linux systems,
although some of our teams do support Windows in different ways.
We don't want to be necessarily pinned
to that subset of the ecosystem forever.
Like it is nice to have the option to do more Windows or to ship to Mac or
something at some point.
Right.
So GCC recently said that they support C++20 modules,
but Clang is still doing Clang-style modules.
Is that difference affecting you all?
Yeah, we're using pretty stable compiler builds right now, so it's more of a medium to long-term goal that we would even use modules at Bloomberg, seriously. So this is me staying ahead of my concern. If we don't start working now, we're going to be talking about a decade or two at Bloomberg, right? Not just waiting for GCC or Clang to catch up to some feature. Okay, got it. Yeah, our main concern
is not necessarily the compiler support, although that is absolutely necessary for anything to
happen. But the main thing is, how do we make it actually work when actual people are actually
typing in their editors kind of thing. Actual people involved.
If you remove those from the equation, I think this would all be much easier.
And the compilers, right?
Yes.
Remove the people in the compilers.
We have no problems.
Yeah.
So, I mean, we're right in the wheelhouse of like,
hey, this is exactly the kind of organization that should really benefit from modules.
We'll have really deep dependency trees.
It'd be great not to have to pre-process like all the world just to,
you know,
do a hello world kind of thing.
You know,
it'd be nice to include iostream without blowing up your build times
exponentially or something.
Right.
But like,
it's like,
well,
if I can't put it in the build system,
that doesn't matter though.
Right.
So,
um,
yeah.
I want to interrupt the discussion for just a moment
to bring you a word from our sponsor.
CLion is a smart cross-platform IDE
for C and C++ by JetBrains.
It understands all the tricky parts of modern C++
and integrates with essential tools
from the C++ ecosystem,
like CMake, Clang tools, unit testing frameworks,
sanitizers, profilers, Doxygen, and many others.
CLion runs its code analysis to detect unused and unreachable code, dangling pointers, missing
typecasts, no matching function overloads, and many other issues.
They are detected instantly as you type and can be fixed with a touch of a button while
the IDE correctly handles the changes throughout the project.
No matter what you're involved in, embedded development, CUDA, or Qt, you'll find specialized support for it.
You can run debug your apps locally, remotely, or on a microcontroller, as well as benefit from the collaborative development service.
Download the trial version and learn more at jb.gg/cppcast-clion.
Use the coupon code JetBrainsForCppCast during checkout for a 25% discount off the price of a yearly individual license.
So are you getting kind of good feedback from the committee with these papers?
You know, are others recognizing this as a problem?
Yes. Short answer is yes. The long answer is, the awkward part about this is that the place where we don't have consensus right now is precisely whether we should compromise, like, can we just make modules work? Because the second paper, P2473, is about, let's make modules work with the ecosystem as it is now. And the place where we don't have a consensus is, is that a good idea?
Like, shouldn't we just push ahead and actually fix the underlying package management problem, which would be great.
And I started some conversations in the SG15 mailing list and with some people in other package management systems.
But it's an immensely complex problem to solve.
So I'm not super confident that, if we put the order of operations to be first package management, then solve modules, we will have something usable before 2030 or something.
So the half a loaf we're shooting for, or a quarter of a loaf,
or however much of a loaf we're talking about, I guess,
is maybe we can start with initial standards for... Not a standard, that's the wrong word, because again, an ISO, that's a special word that means special things.
But, you know, a document describing what a package would look like for a C++ project, which no such document exists.
Like, what is a package?
Like, is this a good package?
Could you look at a package and say, this has the stuff you need to be a C++ package?
Forget where it comes from. Forget how
it's compressed. Forget how you download it.
Forget what its name is. But if you looked at
it, does it have the minimal amount of
metadata possible?
That's not something that has been
done before, as far as I'm aware.
It should be really good for the
package management ecosystem.
Maybe someday it could be something we build on
to make an actual package.
I don't know if we'll ever have a package manager.
It'd be nice to at least have interoperable packaging
such that you enable a new package manager
instead of porting to it, if that makes sense.
Because that's kind of the experience right now.
It's like, okay, I'm going to port to Conan,
or I'm going to port to vcpkg,
or I'm going to port to, you know, the Arch ecosystem stuff. Like it's a porting kind of exercise, just like going to a new OS or something, right?
Yeah, if we could get to the point where we standardize, and again package managers communicate with build systems, if just that language becomes more interoperable, it would be a huge step forward.
Because your build system, instead of having to export a package config file, a CMake export file, a Conan file, would be able to just export this one language
and then other build systems would be able to read that language
and be able to consume those libraries.
But even that, like when you start talking,
like I was talking with Todd from the Spack project
and he deals with all the HPC world,
which is way more complicated than I ever thought could be.
And all the ABI zoo that they have
makes any serious attempt of answering that question
way more complicated than one would expect.
And the thing that is really the challenge here
is when you look at folks using a system like we do,
for instance, which is like a simple package management
deployed with Linux file system hierarchy standards
based deployment,
package config is kind of good enough, right?
Like you have dash I's, you have dash L's,
you have dash D's and that's okay.
So it's a really hard lift to say like,
oh, you know, all these people
that have a really like convenient solution
that works well for them
because they have a very specific scenario,
that's not good enough.
You've got to go to this very much more complicated thing
because in order for us to normalize things,
you need to take care of the use cases
that you don't care about.
Right, right.
And the use case that's the simplest is,
okay, just throw everything into the /usr/include directory
and hit the build button, right?
Like that in particular is just not going to work going forward.
Like, again, you have to,
the build system actually has to model the graph of the things you depend
on, instead of just saying: here, compiler, here's a dash-I flag, have fun.
Right.
So it gets,
yeah,
again,
actually,
we didn't actually make a fine on this detail,
but again,
Daniel kind of over it,
but the big thing we want,
one of the big things we want for modules is that you don't have to go
parse that code over and over again.
Right.
We want to say I parsed that once, maybe even like an hour ago.
I'm not, I just grab it. I don't want to reparse that, right? For that to work, you need a file
on the disk somewhere or a database or something that has that information cached. For that to
exist, your build system has to tell your compiler to make it. So, and for that to work, again,
you have transitive dependencies. The build system has to be able to walk the graph and make sure all those files exist. So you're building things
that aren't in your project, but not as textually included into your source file.
That's something that's new here. That's something people sometimes ask us. What's new about modules?
Why isn't this a problem already? And that's one of the answers is like, well, we want to be able
to only build the one time. And so to build only the one time, you have to know when to build it. So you have to say like, oh, I don't have
a binary module interface
file for boost file system.
Make one, put it in your build directory, and we'll use that
going forward. And like Daniel said,
depending on if you have different compilers, you might have
to have a set of those and know which one to pick.
And worse than that, depending, like,
we still have textual inclusion because especially
C libraries and stuff will have that, so you still need to, you know, tweak those if somebody has slightly different dash D flags or something. So we'll need like check sums to know, this is not the one you want, you want this other one that use the dash D something equals one instead of dash D something equals zero, that kind of thing. Right. The good news is all that should be cleaned up over time. But in theory, as long as you're not including C APIs too much.
So there's a lot of work here.
I liked a lot of work coming out of the talks I saw at CppCon in October.
There were a lot of great talks, a lot of cool things about,
hey, we can use concepts this way and that way.
A lot of them kind of had this implication like,
yeah, this will all be header-only kind of, if you do it this way, but we'll have
modules someday. So when we have modules,
like, we don't care about header-only anymore, right? It's all
just a module interface. It's still private.
You don't have to worry about leaking your stuff into
other people's ecosystems or blowing up their compile
times, right? Because modules will solve that.
It's like, well, yeah, maybe, if we could
get adoption at a certain scale, and that's
kind of what Daniel and I were talking about. Like, okay, well, how do we actually
get there? What's the roadmap?
So we're kind of flexible on the roadmap to some degree,
but we would like to, like we just talked about
with the Agile thing, it's nice to take little steps
if you can, instead of trying to like, you know,
boil the ocean up front.
I'm not opposed to boiling the ocean
if we can all get on the same page
about boiling the ocean, though, I guess.
Sounds like an ecological disaster, though.
Well, I guess we are kind of boiling the ocean slowly, aren't we?
Yeah, slowly.
Okay.
I just want to take like two steps back to make sure, for like my own sake here, that I'm...
Please.
Yeah.
Okay.
The current state of the art for modules means that if I want to use modules for my work
right now, I have to rebuild that package on my system with my compiler, with my flags to get that module.
And then I can use it moving forward.
And there's simply no way to distribute these things to anything other than my specific build configuration.
Right. The current state of the art is there are a few build systems.
And I think build2 is advertised as being one of them, and CMake has some branches that people are doing research on. Maybe you can whip up your own, you know, GNU Make version of your own thing. Daniel has an example of how you might do that in his second paper, P2473, right? Anyway, the concern, the state of the art is, and these are the kinds of things you see in these talks about modules, like, hey, look at the compile times I got.
They're so much better. You have one build system
and it knows about everything.
If you are a certain famous
large software companies
and have all your code into a big monopo,
you might be fine.
Or even if it's not in a monorepo,
but every build,
every repo gets built in the same system
in a single workspace.
There's one build system that knows how to build everything.
So like, CMake has a feature called FetchContent,
which will actually go get code.
If you did that for everything, actually you might be okay.
Assuming CMake knew how to within a CMake build,
like map these modules to each other.
It gets difficult when you have to jump into another build system context.
And that could still be the same style CMake, but just a different build directory, if that makes sense. Like, it's enough that you had a different process modeling the mappings between this, what we call source-to-source dependencies
versus source-to-binary dependencies,
where the idea that you have different projects
that can introspect this source tree of each other
and be able to pass information from one build system
for a build system to know about information
from one source tree to the other source tree
versus environments like ours
or like every Linux
distribution where you essentially have these separate steps where you build a project,
you generate the built artifact, you deploy that artifact, and now you consume that artifact
from a different build system without knowing the original source tree or anything.
And that's where the adoption of modules is right now.
All the testimonials that we have of module adoption right now are in the environments of what we call the source-to-source dependency environment,
where the single build system can see the entire thing,
and then it can build the entire graph, and it can run the entire build.
Right. All of the issues you're discussing,
I was just trying to place my memory here. I believe Izzy Muerte was bringing up a lot of these issues in like 2018.
Yeah. And I hadn't really heard anyone talk about it since then. So a part of me was just going,
oh, cool. Someone solved these problems and now I don't have to worry about it.
But it doesn't sound like really anything happened.
That's my guess as to why we're not talking about it.
Because everyone's like, oh, we got, those people are so smart.
Izzy's like, you know, Izzy knows everything.
Yeah, if Izzy's on it, I'm sure it's fine, right?
And I can't explain it, so we're just like, hey, let's start a conversation.
And so, you know, that's where we ended up.
We're like, why isn't anyone talking about this?
We would ask around, like, have you been talking about this?
Have you been talking about this?
Like, why isn't there a paper for SG-15?
And someone said, well, nobody wrote the paper.
We're like, oh, okay.
So you want a paper?
Okay, we could do that.
Right.
So, you know, Daniel did it.
But, you know, I said we.
I mean, we can't underestimate the impact that like 2020 had on everything.
So I think that probably played a huge role there too.
Yeah.
That's,
that's probably fair.
So have you proof-of-concepted any of these? Like, you have a proposal for what a normalization of these things might look like. Have you implemented those?
So I have, in that paper, the second paper.
Yes. So let me clarify one thing. So we talked about normalizing package management, and we
talked about the quarter-loaf solution, as Bret stated, of just making modules work with the
current ecosystem. So the paper P2473 is actually trying to achieve the latter,
which is let's see how we can make modules work
in an environment that looks like the environment we have today.
And the way that that looks is we essentially have a proposal
where modules kind of look like header deployments,
where you can map the module include through like a path on the file system.
You have a search path
and you have a specific translation
from the module name to the path on disk.
But then because modules have the separate parsing context
for the module interface unit,
and by that, I mean the dash I's
for the code consuming a module don't have to match, or I should even say shouldn't match the dash I's of the parsing of the module interface itself.
And so this proposal includes what we call a meta IXX info file that describes here's how you parse this interface. And then the proposal also comes with the convention
on how to name the BMI file,
such that if you find the file with that name,
you know it's compatible and it's the right one.
And then if you don't find it,
you can just use GNU Make's VPATH to say,
oh, it's not here.
I'm just going to produce my own
and now I can consume that module.
So in the process of writing that paper, we actually wrote a little toy project using GNU Make that's linked from the paper to exercise and show, look, we can make this work with GNU Make.
We just need a couple extra steps to generate the module dependencies, which would essentially be like a C++ modules config utility that scans this module search path, which for the case of this toy project
was the C++ modules
MakeMake, which would essentially generate
the rules for producing all those
BMI files that you would need.
But this is all with
a mock compiler and
mock parts. You could do that.
Probably, you'd just want CMake
or something to know how to do all that.
We're also talking to Kitware about like,
how can we get that going or,
you know,
can we try,
try this out?
Hopefully someday,
there'll be something published in that space.
But I'm just going to ask,
like,
have you gotten much feedback from build system implementers or compiler
implementers on some of these proposals?
Yeah,
we work pretty closely with Kitware in particular,
because we use so much CMake. And they're the core maintainers of CMake,
if people didn't know.
So yeah, we've sponsored some work in there
the last couple of years for different things
that we've started with, like QuickWins especially,
and just general CMake workflows.
Like, I don't know, some easy ones to explain
would be environment variables you can set
for CMake build types.
You don't have to just add dash-D CMAKE_BUILD_TYPE to every stupid CMake command you run.
You can just say, just put in your dot files, stop setting it, it's fine.
Yeah, we've seen some of those in the more recent releases, release notes come up, yeah.
Yeah, some of those are Bloomberg sponsored, because we're like, again, that's not cattle.
I don't want to have to go to all my CI setups and add the dash D, whatever, to every single one of them.
That kind of stuff.
So we're talking to them about modules.
There are Kitware people in the SG15 meetings, keeping tabs on things too.
So as far as that goes, we're pretty confident.
We'll figure something out there.
The other build system, there are other build system implementers in the SG15 meetings.
They haven't objected saying,
I could never do that.
I can never implement it.
Not yet anyway.
I wouldn't say every single person
that's ever worked on a build system
or cares has been in the meetings.
So I would encourage people
that care a lot about build systems
to at least subscribe to the mailing list
and provide feedback in other channels,
even if you can't make the meetings.
I guess the other question I have is,
sorry, just real quick,
does this require actual ISO buy-in,
or is it just something that the different compiler vendors,
build system vendors kind of need to agree on
to have a set of standards?
There's a long answer, but the short answer is
ISO doesn't really do specs for build systems
or packaging systems or anything like that.
So Daniel kind of said earlier,
the best we can do is publish a report saying,
hey, we polled and we all think this is a paper
that you could read.
And we like it as a paper
that you could read.
I guess I'm not an expert on ISO workings
and how all that works.
Possibly you'd need a whole other ISO group to just be like build systems, C++ build systems versus the language.
But a different guest could probably give you a better answer to that question.
But that being said, like, yes, I mean, the real end goal, if we're talking about outcomes, like the Agile thing, what are we actually trying to do, where are we actually going here? But you're right on the nose.
Like the real thing is we want to know that the ecosystem is going to be
there,
that they're at least not saying I'll never do it.
It's like,
if you really,
like we would live with,
if you twisted my arm and you donated time or money or volunteer work or
something,
I guess we can make that happen.
A lot of times that's what,
something that people need to struggle with as engineers
because we'd like the perfect, the pristine, the, you know,
like let's get all perfect and then we'll ship it.
A lot of times what we really need for these kinds of consensus-driven things
is, yeah, I can live with that.
You know, like I would sleep fine at night if that's what was happening.
And that's kind of what we're shooting for.
That's the expectation. Some people will be at "perfect," I'm sure, but we want nobody to be out; we want everyone to be at least in "I can live with it."
That's actually really important, right?
Because anything short of that,
and we start ending up fracturing the ecosystem,
we'll have modular C++ people
and non-modular C++ people,
and then the Rust people over there
gloating at everybody else.
We don't want that.
I think one very positive thing that has come out over the rewarming of SG15 over the past few months is that we're seeing more and more people from really important projects being more involved with SG15. I think in the last meeting we had someone from Red Hat, which, from the package management perspective, having someone from a Linux distribution is kind of important.
We had Olga from Visual Studio at Microsoft in the meeting.
Having Visual Studio in the table is really important if we're talking about converging things.
We had CMake and we have other people that are starting to be more present on SG15
and I think this is a good sign.
Right, right. So if you're not involved,
fear of missing out.
Just keep that in mind.
Is it possible Red Hat
actually invented the modern package?
Before that, everything was just a tarball?
I think chronologically
speaking, Debian came before.
Really? Yeah, I think Debian came before.
Debian was created in 1993
and they used packages
you know that Debian
is called Debian because it was
Ian Murdock and his wife
was called Debra
so it's Debian
I had a flashback when you
mentioned Red Hat getting involved, because at my first real job in '97,
I actually bought the RPM manual because I had to generate some Red Hat packages, and there was no automated tooling for that yet.
Anyhow.
There's even automated tooling now, so it's better.
Right.
That's kind of what we'd like to see is someday,
like if you just ran, you know, whatever your build system is,
but I know CMake, so I'll use that.
CMake and then CMake install,
and you end up with something that looks like a package and you just tar it up and you're done.
Right.
And that's where we'd like to get.
I think we can get there.
Yeah, a lot of sticking points these days are about things
like, as Daniel said, how do we deal with file systems?
Again, there are no files in the ISO standard.
If we have the interface, do we want to actually say the file needs to be named this way and live in this part of the file system?
How does that work?
Sticking points include, could we do more sophisticated solutions?
Gaby Dos Reis had a really interesting talk at CppCon in October about something called IFC. I can't remember what the acronym stands for, but it has to do with
an interchange format whereby you can serialize the abstract syntax graph of a compiler's parsed knowledge of some source code, right, then save it in a standard format, and some other compiler or tool, like an IDE or clangd or something, can go open that up without having to do all that extra work, just walk that graph and decide things. The ideal outcome being you could write, you know, some Python tree-walking code and do interesting source code analysis things without having to actually have libclang or something linked into your executable. Great.
It's really interesting. That'd be great for Bloomberg. We'd love for that.
We'd love to have that. But I guess
like, do we want to wait 10, 20
years for that to propagate to all the tools we
need it to? I don't know, right?
Let's start on it, sure. But let's get something
else in the meantime kind of thing.
I think Visual Studio can emit that today,
I believe, for the
sake of our listeners here.
It might not be production ready for a large-scale company like yours.
No, C++ modules on Visual Studio are implemented in terms of that format.
Okay, right.
Again, we'd be okay if enough of the ecosystem all got together and said,
okay, we're going to sponsor these groups to go add these pull requests to these compilers.
And we thought, hey, we can get this done in five years. Yeah, the challenge that I see there
is the reason why the AST changes
between different versions of Clang is not necessarily because the
C++ language changed, although, like, if you adopt different standards, you would
of course have different abstract syntax trees, but even
on the same language standard, from Clang 12 to Clang 11,
there are differences in the AST.
And the reason for that is, oh, it turns out that there are some niche cases
that can't be represented in the AST as it was before.
So the AST was adapted to be able to represent those valid C++ use cases.
So one other way of thinking about this
is that this standardized serialization format
would need to go through the same level of rigor
that we have on the C++ language
when it comes to the lexer and the parser, right?
Because you would need to be able to specify
to an extent where you say,
if it can't be represented by this abstract syntax tree,
it's not valid C++.
And that's a really, really hard challenge.
And I'm personally supportive.
Like if we want to like move the C++ language standard
to include like this standard abstract syntax representation of the language,
not just at the level of the tokenizer and the parser,
I'm all for it.
But it's a substantially big amount of work.
Like the W3C DOM or something.
Yeah, a lot like that, actually.
Okay, well, hopefully, at some point in the future,
build systems, compilers, and everything will converge
and we'll start to be able to get to use modules more.
Is there any other changes that we haven't talked about
that you're hoping to see from modules,
either from the standard or from compilers and build systems?
We need support from the build systems.
We are working with Kitware for CMake
to support building modules internally first
and then to have support for shipping
and consuming module libraries
outside of CMake to CMake scenarios
because Kitware is fine just adding more things
to the CMake export file.
And then if you're consuming from CMake, it's all fine.
But we don't really want to have like that tight dependency
on everyone using the same build system.
So having more implementations on the build system side
is definitely important.
My expectation right now is that the language feature is
fine. I don't think we have any problems
today that are caused by the language level
specification. So now it's more mechanically, how
do we make it such that the human can go write code, ship
an artifact to another human to consume?
On the contrary, we have a lot of problems
that modules are supposed to solve, right?
So I'm really excited.
If I didn't have to worry about people
setting inconsistent, binary-incompatible flags
across different translation units,
that would be fantastic.
Things like how big is your long, and things like that, right?
Like, you know, I'm talking about like the dash-D large-file flags. Yeah. Or, here's a fun bit of info: you'll hear a lot of people say that the standard C++ version is ABI-stable, is ABI-unimportant, right? You can change your standard flag. Which is actually not true at scale. When you get to a certain scale,
which isn't that big, actually,
you'll find that some libraries
start doing feature detection
using template metaprogramming, right?
Like if you have lambdas available,
use this implementation,
otherwise use this other implementation
and those will be ABI-incompatible implementations.
I'm hand waving a lot, right?
But you'll see it at a certain scale.
You see it already in popular libraries like
Abseil. Like, Bloomberg has something called BDE, which is our kind of analog to Abseil or Boost or something, right?
And so those libraries do that. Everybody depends on
those libraries, and so all those things are
technically ABI-unstable, depending on what standard
version you set. So point being, for
modules, at least you wouldn't have to worry about
that thing leaking
out a little bit, maybe.
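A made-up example of the feature-detection pattern being described here, where the class layout depends on the language level it was compiled under:

```cpp
// Hypothetical illustration: the member only exists in C++17 builds, so a
// translation unit built with -std=c++14 and one built with -std=c++17 see
// different layouts for the same type -- technically ABI-incompatible even
// though each one compiles cleanly on its own.
#include <cstddef>

struct Widget {
#if __cplusplus >= 201703L
    std::size_t cached_size = 0; // "newer" implementation detail
#endif
    int value = 0;
};
// sizeof(Widget) differs across those two builds, so passing a Widget between
// them is exactly the kind of preprocessor-driven ABI break being discussed.
```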
It seems like it would be moving in the right direction
and we can relax a lot of things.
Preprocessor-driven
ABI breakages are going
to be way easier to manage
after modules. We'll just have to worry
about within the
things that you textually include, is it a problem?
I don't know how to convert all the
things to modules yet necessarily,
but assuming we were there,
like there's a lot of problems we could solve and there's problems we could
even contain. Like, okay,
maybe we can only get half the things modularized anytime soon,
but that at least contains those fires to like subsections of the code base
or something. Right. Yeah. So, and that's our day to day.
I mean, if that wasn't clear, Daniel and I do a ton of like, okay,
how do we get all of our code,
like including open source code to do the new compiler
or to remove this kind of ABI unstable thing or something.
We didn't have a chance to really talk about ABI
before running out of time.
All right, you can have us back on some other time.
I'll talk your ear off about ABI if you want.
Oh, goodness gracious.
I'll say Marshall Clow's talk
was pretty on the nose
from our perspective, I'd say. I mean, when he was
on here a while ago about
ABI, that's pretty on the nose.
There's a lot of weird edge cases
and at a certain scale, it's really hard.
It's really easy to underestimate
how complicated it is
and a lot of people think that
it's fine to ignore the problems.
It's especially important at the OS level.
Because we use OS, like vendor-provided,
OS-provided compilers and standard libraries
and some low-level libraries like Zlib, things like that.
Those things really need to be ABI-stable.
We build the rest of the world from source in a certain respect.
You can check our talk on packaging that Daniel and I did this October at CppCon. It has enough detail that you can probably figure out where we're coming from as far as that goes.
This is all related, so modules should help.
Gaby Dos Reis's talk talks about,
hey, ABI is going to be a lot better if you use modules.
I think he's right, but how do we get there?
That kind of thing.
All right.
Well, Bret and Daniel, it was great having you on the show today.
Thank you so much for coming on.
Yeah.
Thanks for having us.
It's been fun.
Thanks a lot.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in or if you have a suggestion for a topic.
We'd love to hear about that, too.
You can email all your thoughts to feedback at cppcast.com.
We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter.
You can also follow me at robwirving and Jason at lefticus on Twitter.
We'd also like to thank all our patrons who help support the show through Patreon.
If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode is provided by podcastthemes.com.