CppCast - libstdc++

Episode Date: May 16, 2025

Jonathan Wakely joins Phil and Timur. Jonathan talks to us about libstdc++ (GCC's standard library implementation), of which he is the lead maintainer, and tackles some tough questions like ABI compatibility - and how GCC and libstdc++ approach it.

News: GCC 15 released (release notes); Boost.OpenMethod review (finished); 2025 Annual C++ Developer Survey "Lite" (closed)

Links: GCC Mailing Lists

Transcript
Starting point is 00:00:00 Episode 398 of CppCast, recorded 8th of May 2025. In this episode, we talk about GCC 15, Boost.OpenMethod, and the 2025 Annual C++ Developer Survey. Then we are joined by Jonathan Wakely. Jonathan talks to us about libstdc++. Welcome to CppCast, the first podcast for C++ developers by C++ developers. I'm your host, Timur Doumler, joined by my co-host, Phil Nash. Phil, how are you doing today? Oh, all right, Timur, how are you doing?
Starting point is 00:01:02 And how was your vacation? Not too bad, thank you. My vacation was fantastic. We spent two weeks in Uzbekistan where I have family. So my dad is from Uzbekistan. So quite a lot of family there. That was a fun kind of family trip. So that was a good time. Thank you again so much, Phil, for doing the last episode with guest co-host Anastasia while I was away. I really appreciate that. I also listened to it. So great to listen to a new episode of CppCast that I don't know yet. I don't get to do that very often. That was a lot of fun. Great episode. Thank you very much for that.
Starting point is 00:01:36 Very welcome. So how are you, Phil? What's new on your side? Not a lot. I still seem to be recovering from post-conference flus and things like that. But other than that, just building up for the next ones. All right, cool. So at the top of every episode, we'd like to read a piece of feedback. And this time we received an email from Luciano regarding the last episode about AI tools with Daisy Holman. Luciano writes, great episode and really a good time to talk about those tools. I was hoping to hear a little bit more about the concerns regarding intellectual property of code
Starting point is 00:02:11 generated by AI agents and what we lose from the creative process of problem solving as we start relying more and more on those tools. Phil, do you have any opinion on this kind of feedback? Yeah, so well, first of all, we were just starting to get into some of the negative aspects towards the end. We were already running quite long, so we had to cut it short. It's unfortunate that we didn't get into that because if you listen to the show, you'll know Daisy was actually quite prepared to talk about some of the controversial aspects as well. So maybe we will have to have a follow-up episode sometime because there's definitely a lot of concerns like this that people have. So it'd be good to discuss them.
Starting point is 00:02:50 Personally, I'm not sure where we stand as a community on this yet. There's obviously some quite negative aspects to it, but we haven't really thought through all of the details and what that actually means. So difficult to just have a complete opinion on it right now, I would say. Well, I have some thoughts on this. As a free software developer who writes all my code, most of my code covered by the GPL, I'm definitely concerned about the intellectual property side. I think that's a huge, possibly a time bomb that is going to have a reckoning or it just won't because the companies that are generating this, writing these generative agents just have such deep pockets that they can probably do it and win any court cases.
Starting point is 00:03:40 But that's a huge issue that would take the whole cast if we were to talk about it. And I'm not an expert on that side. But I think the second part of the question is very interesting too. Not only losing the creative process of problem solving, but something I've been thinking about recently is the side of training and mentoring for new developers, new hires. That creative process of problem solving is an incredibly important part of the learning process of how you become a senior developer and advance your career. And if everybody's using generative agents to do the sort of the drudge work
Starting point is 00:04:20 or the simple, you know, oh, this is easy. I can just get Copilot to do this for me. Where do the interns get their experience? Where do the junior developers figure out how to solve these problems for themselves? Because in 10 or 20 years, where are we going to get senior developers from if no one learned to solve problems and learned to write bad code and learned to write code that needed review and needed replacing and needed fixing. I have concerns about that. That's definitely very valid concerns.
Starting point is 00:04:50 I remember though, if I remember correctly, Phil, you did discuss this point at the end of the episode. And I think Daisy had a different opinion. She said that it would actually be better for like onboarding juniors and faster to help them do that with AI in certain aspects where they still have to figure out the rest by themselves or something like that vaguely. Is that about right, Phil? Yeah, there was definitely a different take on it. And who knows how that's really going to play out. I think that the point is we don't really know at this point. We're just entering unknown territory and we should have these
Starting point is 00:05:29 concerns, but we should also be open to things playing out differently. Yeah. All right. So it's possible that the AIs will be great mentors. I just, yeah, I don't know. I think we're going to find out. We will indeed find out one way or another. So that's going to be interesting. We'd like to hear your thoughts about the show. You can always email us at feedback at cppcast.com. And joining us today, as you've already heard a little bit now, is Jonathan Wakely. So Jonathan works for Red Hat as the lead maintainer of libstdc++, which is the C++ standard library for GCC.
Starting point is 00:06:05 And he's also the current chair of the library working group in WG21, which is the C++ standards committee. Jonathan, welcome again to the show. Hi, thanks for having me. So, yeah, Jonathan, we were talking just before the show that you've been at Red Hat for 12 years, did you say? 11 and a bit, yeah. 11 and a bit, well, rounds up to 12.
Starting point is 00:06:22 Is the only thing that you're doing there working on libstdc++, or do you actually have any other involvement with? It's the main thing I do, but I also do some packaging for Fedora and Red Hat Enterprise Linux. So I take care of the packaging for Boost and Intel Threading Building Blocks. I think that's it. Maybe just a couple of tiny header-only packages, one of which is my own
Starting point is 00:06:54 package, which got into Fedora. So yeah, I spend the vast majority of my time working on libstdc++, but then now and then Red Hat will say, we need you to do this. And I'll stop doing my main job. All right. So we will get more into your work in just a few minutes. But before we do that, we have a couple of news articles to talk about. So if you want to comment on any of these, please feel free to do so. I expect that maybe you might have a comment or two on the first news item, which is the release of GCC 15. So last time we spoke about the Clang
Starting point is 00:07:27 20 release. Actually GCC 15, which is another major compiler release, also got released just before Phil recorded the last episode, but it was so shortly before the recording that it didn't actually make it into the news items. So we're going to do it this time with a little bit of a delay. Yeah, the release notes are out. We're going to put a link in the show notes. As always with a major release, lots of new stuff. On the C++ side in particular, they implemented lots of C++26 features on the language side. So that's amazing because that's not even out yet. That's not even finished yet. We haven't even finished working on it on the committee, right? They've already implemented a bunch of stuff. So in GCC 15, you get on the language side pack indexing,
Starting point is 00:08:12 which is really cool. That's when you have a pack with dot, dot, dot. You can do a square bracket index into it. You don't have to write the recursive template thing anymore. So that's super cool. You get #embed. That's a big one.
Starting point is 00:08:24 That is a big one. You can put structured binding declarations in the initializers of if, while, for, and switch statements. You've got constexpr placement new. On the library side, you've got a bunch of new views like views::concat, to_input, and cache_latest, and you get constexpr sorting algorithms, which is very, very cool. And also you get more support for existing standards, like the latest one: C++23 wasn't quite 100% supported yet, but now the support is just a little bit more complete. Now you get P2644, which is the fix for range-based for loops. So you no longer get UB if you have like a temporary there in certain cases, which is also the same fix that Clang 20 did, which you talked
Starting point is 00:09:13 about just last week. You also got, on the standard library side, std::flat_map and std::flat_set. So those are really cool, really useful for performance-sensitive code. You can now import the entire standard library as a module, which is also very cool. And then apart from language and library support, there's, of course, loads of features just on the compiler side, both front end and back end. There's better auto-vectorization. You get incremental link-time optimization,
Starting point is 00:09:42 which quite significantly reduces the average recompilation time when you do LTO and then edit your code and then do LTO again. There's lots of other features and improvements. In particular, on the backend, there's a huge list, and also for the other programming languages that GCC supports, there's lots of new stuff there too. So it's a really, really big release. So I guess Jonathan, you were involved in that, right?
Starting point is 00:10:05 So congrats for this amazing release. Do you have any comments on this at all? Sorry, we released it to surprise you. So you didn't have time to talk about it last time. We just wanted to get in there before Clang. We release around this time every year. So we have a sort of not entirely fixed schedule, but basically we aim to have it late April or early May. And it was a little bit earlier than previous years this time. We basically wait till we've got no open regressions since GCC
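To make the pack indexing point from the GCC 15 feature list concrete, here is a minimal sketch. The helper names `nth_value_old` and `nth_value_new` are made up for illustration. The C++26 branch is guarded by the `__cpp_pack_indexing` feature-test macro, so on a compiler without pack indexing only the tuple-based workaround is compiled.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

// Before C++26: go through std::tuple to reach the I-th argument.
// (Hypothetical helper name, not a standard facility.)
template <std::size_t I, typename... Args>
constexpr auto nth_value_old(Args... args) {
    return std::get<I>(std::make_tuple(args...));
}

#if defined(__cpp_pack_indexing)
// C++26 pack indexing: index the pack directly with square brackets.
template <std::size_t I, typename... Args>
constexpr auto nth_value_new(Args... args) {
    return args...[I];
}
#endif

// Small self-check showing that both spellings pick the same element.
inline void pack_indexing_demo() {
    assert(nth_value_old<1>(10, 20, 30) == 20);
#if defined(__cpp_pack_indexing)
    assert(nth_value_new<1>(10, 20, 30) == 20);
    static_assert(nth_value_new<2>('a', 'b', 'c') == 'c',
                  "pack indexing also works in constant expressions");
#endif
}
```

With GCC 15 the second branch should become active when compiling in C++26 mode; that is an assumption about how the feature-test macro is exposed, so check your compiler's documentation.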
Starting point is 00:10:41 14, the last major release. And then on the day that happens, we're like, everybody stop doing anything, and branch for a release. So as soon as you've got no regressions, everything's fine, right? It's no fun as well. So how often do you release major versions of GCC? It's every year.
Starting point is 00:10:58 Every year. Once a year. Yeah. OK. And then we'll do 15.2 in a few weeks with the new regressions that we didn't find before the release. And then that's sort of, so yeah, the second release of GCC 15 will be quite soon, probably.
Starting point is 00:11:15 And then it'll be a few months after that. We'll do the next round with the next round of little bug fixes and regression fixes. But we've already gone ahead and started working on GCC 16. Now that 15 is out, we're like, right, start breaking everything and then spend the next year re-stabilizing and get ready for GCC 16. So do you have a favorite thing that's gone into GCC 15?
Starting point is 00:11:39 I think possibly the std module. But there's a lot of new formatting stuff as well. I really like std::format. And thanks to Tomasz Kamiński, we've now got the container formatting, range formatting, tuple formatting, all that sort of thing, which was the main bit of std::format that I had not done for GCC 14. 13 added some of it and then 14 added some more. But I kept finding the container stuff too hard. Yeah, great for testing, printing out the contents of a vector or something.
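As a rough illustration of the range formatting mentioned here, the sketch below prints a vector with `std::format` when the library advertises `__cpp_lib_format_ranges`, and otherwise falls back to a hand-written loop that produces the same text. The function name `show` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>   // brings in the library feature-test macros
#  endif
#endif
#if defined(__cpp_lib_format_ranges)
#  include <format>
#endif

// Render a vector as text, e.g. for a test failure message.
std::string show(const std::vector<int>& v) {
#if defined(__cpp_lib_format_ranges)
    // C++23 range formatting: the container is formatted directly.
    return std::format("{}", v);
#else
    // Fallback for older toolchains: build the same "[a, b, c]" text by hand.
    std::ostringstream out;
    out << '[';
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (i != 0) out << ", ";
        out << v[i];
    }
    out << ']';
    return out.str();
#endif
}

inline void show_demo() {
    std::vector<int> v;
    v.push_back(1);
    v.push_back(2);
    v.push_back(3);
    assert(show(v) == "[1, 2, 3]");
}
```

The fallback exists only so the sketch builds everywhere; with a C++23 library the single `std::format` call is all that is needed.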
Starting point is 00:12:16 Yeah, it's super useful and I wanted it. I just didn't manage to do it myself, but it's there for 15. So I'm excited about that. Yeah. And it's always good when we don't talk about these things just in terms of somebody wrote a paper or the committee approved a paper, but it's actually out there. And in the compiler, you can use it today.
Starting point is 00:12:32 That's very, very cool. So congrats again on this release. We have two more news items, which I very quickly want to go over before we go to the main part of the episode, which is talking more about GCC and the standard library of GCC. So one of the other two news items I want to mention is a nascent Boost library. It's not yet an official Boost library, but it will hopefully be a Boost library soon. Boost.OpenMethod. So it's now in review. So they have this particular procedure where you can submit a library to become
Starting point is 00:13:03 a Boost library, and then it's in review. And then you have a review manager, I think they call it, and a bunch of people reviewing it. And then at the end, if you get a thumbs up, you fix whatever they told you to fix, and then it goes into Boost. So there is a new one of those, Boost.OpenMethod. So it's proposed for Boost. And this one is quite a generic library. So everybody is actually welcome to contribute a review. So if you are listening to the show, you can go and review this library and give feedback to the author, say whether you recommend to reject or accept it into Boost, whether you
Starting point is 00:13:34 suggest any conditions for acceptance into Boost, or you have any other feedback on this library. But go check it out, it would be very helpful for both the author of the library and the Boost community and I guess everybody else using it. You can contribute a review on the Boost mailing list, or you can just directly get in touch with the author. Are you going to put a link to the show notes, I guess, to how to do that, or maybe some place where you can figure out how to get in touch. We will put the links in the show notes, but I think the review may have already finished at this point. Oh, okay. So I saw this on Reddit like a couple days ago. Yeah, so it's going to continue till 7th of May. Oh, that was yesterday. As we record. So, but I'm sure the author would still appreciate any comments if you have them. Right. You're not part of the official
Starting point is 00:14:23 review. So, so actually, I should mention what the library is actually about, not just the process. Um, so the library adds open methods to C++. So that's pretty cool. That's like virtual functions, except they're defined outside of classes. Yes. So you can override them, but they're not part of a class. They're freestanding functions. That is a really powerful technique. You can do a lot of stuff with it. It makes it easy to implement multiple dispatch, I guess that's kind of one of the use cases people will probably be most familiar with. There's lots of others. Like using it allows you to avoid God classes in many situations,
Starting point is 00:14:59 visitors. So it's an alternative to that, to provide a solution to the expression problem and the banana gorilla jungle problem, which I have to admit I was not familiar with until I saw this Reddit thread, and then I looked up what the banana gorilla jungle problem is. Are you two familiar with this? Never heard of it. That's a quote from Joe Armstrong, creator of Erlang. Complaining about OO languages, where he said he wanted a banana, but what he actually got was the gorilla holding the banana
Starting point is 00:15:30 and the whole jungle with it as well. Often when you just want to reuse a piece of code in an OO hierarchy, you get the whole hierarchy along with it as well. Right. So yeah, it sounds like a really powerful kind of paradigm design pattern, open methods. So very, very cool. It also says that it's fast. It's as fast as virtual functions are in C++ today. So you get just, what is it, two levels of indirection. So yeah, pretty fast. And yeah, that's that. Yeah. So this library has actually been around for a while. It's based on YOMM2, I think is the original name. Something I've looked at a few times in the past, where I've occasionally had the need for multi-methods.
Starting point is 00:16:16 And YOMM2 seems to be the one that does it. It's interesting that they actually say that most uses of it are actually for the single dispatch. So just using an external virtual function. And in that case, it's actually more like a type erasure object. Not exactly the same thing, but there's a strong overlap there. So it's interesting to see a lot of effort going into this space. All right. And the last news item I want to talk about today is that the 2025 annual C++ Developer Survey Lite, that's how it's called, with "Lite" in quotes, is now out.
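For readers who have not met open methods or multi-methods before, the following is a deliberately naive, hand-rolled sketch of the idea: a free function whose implementation is chosen from the dynamic types of two arguments. It is not the Boost.OpenMethod or YOMM2 API; `MeetMethod`, `Animal`, `Dog`, and `Cat` are hypothetical names, the lookup is a plain `std::map` (so it is nowhere near as fast as the library discussed above), and it only matches exact types rather than walking the class hierarchy.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

struct Animal { virtual ~Animal() {} };
struct Dog : Animal {};
struct Cat : Animal {};

// A two-argument "open method": overrides live outside the classes and
// are selected from the dynamic types of both arguments.
class MeetMethod {
public:
    // Register an override for the exact type pair (A, B).
    template <typename A, typename B>
    void add(std::string (*fn)(const A&, const B&)) {
        table_[key(typeid(A), typeid(B))] =
            [fn](const Animal& a, const Animal& b) {
                return fn(static_cast<const A&>(a), static_cast<const B&>(b));
            };
    }

    // Dispatch: typeid on a polymorphic reference yields the dynamic type.
    std::string operator()(const Animal& a, const Animal& b) const {
        auto it = table_.find(key(typeid(a), typeid(b)));
        if (it == table_.end())
            throw std::logic_error("no override registered for these types");
        return it->second(a, b);
    }

private:
    typedef std::pair<std::type_index, std::type_index> Key;
    typedef std::function<std::string(const Animal&, const Animal&)> Impl;

    static Key key(const std::type_info& a, const std::type_info& b) {
        return Key(std::type_index(a), std::type_index(b));
    }

    std::map<Key, Impl> table_;
};

// Overrides are ordinary free functions, defined outside the classes.
inline std::string dog_meets_cat(const Dog&, const Cat&) { return "chase"; }
inline std::string cat_meets_dog(const Cat&, const Dog&) { return "hiss"; }

inline void open_method_demo() {
    MeetMethod meet;
    meet.add(&dog_meets_cat);
    meet.add(&cat_meets_dog);

    Dog d;
    Cat c;
    const Animal& a = d;  // static type Animal, dynamic type Dog
    const Animal& b = c;  // static type Animal, dynamic type Cat
    assert(meet(a, b) == "chase");
    assert(meet(b, a) == "hiss");
}
```

The point of the sketch is only the shape of the technique: new behaviour is added by registering a function, without touching `Animal` or its subclasses.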
Starting point is 00:16:52 It's the yearly survey by the Standard C++ Foundation. When I say it's out, I mean it's open. You can participate. You can fill in the survey. Please go ahead and do that. That is very, very helpful. It takes only about 10 minutes to complete the survey. A summary of the results will be posted publicly. And these kind of surveys are really important because they make sure that the... So this
Starting point is 00:17:15 one is by the Standard C++ Foundation, which is kind of the organization behind a lot of the committee work. So it makes sure that the committee stays in touch with the wider C++ community. It also helps lots of other people like tool vendors and others to provide you with the features you actually want and need. We actually got three such surveys as far as I'm aware. There is the Standard C++ Foundation one, which is this one.
Starting point is 00:17:40 There's the JetBrains one. There's the Meeting C++ one, the one that Jens Weller runs. So those three, they're all yearly. And it's also interesting to see the differences between the three. There's a bunch of questions like which standard version do you use, which IDEs do you use and things like that. So go fill it out.
Starting point is 00:17:57 Very important to have those. They're very helpful for everybody. Yes, and this one is currently open as we record. I believe it's actually closing on Friday, which is when this should be released. So hopefully, there will be a short window where people may hear this and still have time to actually go and complete it. But it's going to be tight, so don't hang about. All right. All right. So that concludes the news items. So we can move on to the main topic of today, which is to talk to Jonathan Wakely. Hello again. Hello. About your work on the GCC implementation of the standard library. And so the idea to invite you goes actually back to two emails we received from listeners.
Starting point is 00:18:45 One was from Paul Luckner back in August 24 and the other one was from Abe Mishler just a few weeks ago. I think Phil, you mentioned it in the last episode. Both of those people were interested in kind of different aspects of C++ and wanted to hear from an actual maintainer and actual developer of the library about certain things. So we decided to get Jonathan on the show and to talk about those things. And hopefully we will be able to cover those questions that those people had and more. So thank you again, Jonathan, for joining us today and taking the time to talk to us. Sure. Yeah. Let's see if we can cover all of
Starting point is 00:19:22 the questions about libstdc++ in the remaining time. All right. So there might be a lot. So the first one is my own question, actually. So I've known you a while from the committee, but I was actually always wondering, how did you actually get to work on libstdc++? How does one become a standard library maintainer?
Starting point is 00:19:40 I just invited myself. I just started sending emails to the mailing lists and waited until other people retired and I was the only one left. That's the short answer. Yeah, I wanted to improve the documentation. And I think there was a push at the time to start using Doxygen, and some of the old HTML docs were being rewritten in the DocBook XML format. The idea being that that's more structured and a better representation form for high-level documentation because it's a bit like LaTeX. You can create chapters and books
Starting point is 00:20:23 and different bibliographies and things like that, which HTML is great for presenting web pages, but has a less specialized structure for some of those sort of publishing things. Yeah. So I wanted to improve those docs because they were a bit bare bones at the time. I was trying to convince my manager at the time that using std::vector instead of our own homegrown dynamic array thing would make sense.
Starting point is 00:20:47 And he was like, Oh, I don't know. I think the implementations are a bit immature and not quite ready for mainstream use. So I started improving the documentation to, you know, try and demonstrate that this stuff was high quality and it was complete and all the member functions that you would expect there were there, because the first C++98 standard was still quite new at the time. And then I just contributed more than docs: reporting bugs and bug fixes. And then I think one of the first major things I did was working to get the Boost shared_ptr code contributed to GCC.
Starting point is 00:21:27 So our shared_ptr is based on Peter Dimov's and other Boost authors' shared_ptr. And I worked with some of those, including Peter, to get it contributed to GCC. And then we forked it and we've maintained it separately since then. But originally it was based on the Boost code. So I think that was one of my first major contributions. It wasn't even writing code. It was just, you know, coordinating the addition of some existing code. And I just stuck around long enough that I added more and more and other
Starting point is 00:21:58 people left through boredom or retirement. So it sounds like you've been doing this quite a while when you say the 98 standard was pretty new at the time. Yes. I left university in 99 and I don't know, probably within a year or two was itching to use what was in GCC at the time and demonstrate that it was solid and reliable to my, proved my manager wrong basically.
Starting point is 00:22:28 And it wasn't entirely reliable at the time, but it's got a lot more solid since then. So yeah, I've been contributing to libstdc++ for more than 20 years. Yeah, yeah, that's wow, that's impressive. You said you were the last one left at some point, I presume that was not literally true. There are other people still working on libstdc++?
Starting point is 00:22:48 There was definitely... I'm not definitely interested. I think there was a time when I was the only person who was doing it as a job. There were some other volunteer contributors in the community, which is how I started before it became my job. But of the actual maintainers and people who can approve work from other people to be committed and that sort of thing, I was the only one who was still active. It's not entirely a joke and an exaggeration. Are you currently the only one who's active also still? No. So at the start of the year, there
Starting point is 00:23:26 were about one and a half people working on it employed by Red Hat. One of them also works on the G++ side of it as well, not just the library. But since then, Red Hat have hired two more people to work on the library. So we've gone from one and a half to three and a half, which is quite a noticeable difference in terms of resources. But there are also other people
Starting point is 00:23:51 in the community, some who've been contributing for years, and then some who've just started recently. We're very grateful for their additions. The constexpr sorting stuff you mentioned earlier was done by Giuseppe D'Angelo. I think he's doing it on behalf of his employer, KDAB. I'm not sure how they pronounce that. So he's started contributing in the past seven or eight months or so, and added lots of good stuff, lots of the new C++26 proposals, some things that he's proposed
Starting point is 00:24:25 himself and got into the standard. And then he's also just picking up other things and implementing them. So he's a good example of someone who has become a libstdc++ contributor just by turning up and doing the work, just start sending emails and that's how you become a contributor. So I actually did want to ask you something else now, but this is just a perfect segue into one of the kind of questions I wanted to get to a bit later. But let me just do that now. Cause you already answered it so beautifully. So how can somebody actually get involved in contributing to
Starting point is 00:24:58 libstdc++? It sounds like you just, you know, get on the internet, get on the mailing list and start doing stuff. Is that pretty much it? That's exactly it. Yeah. Um, our patch submission and patch review process is like the Linux kernel. It's very old school. It's all email based.
Starting point is 00:25:15 Um, so we don't have a GitHub that accepts pull requests or anything like that. This is not entirely a good thing, but that's the way it is. Um, I can say more about that; I have thoughts. But yeah, at the moment you just send emails to the mailing list. You don't even need to subscribe if you don't want to see everything else that's being discussed on the mailing lists. But yeah, just send an email with your proposed patch or say, hey, I'd like to start working on this.
Starting point is 00:25:45 Has anyone got any idea where I should, you know, what would be the right way to fix it? You can pick a bug. If there's something you've noticed that's not working or something you've seen in our bug database or a missing feature, you know, something from C++26 that's not there yet, then either send an email to announce you want to work on it or just start working on it and send us a patch. Just yesterday, I think, I pushed the first set of patches to implement mdspan, which
Starting point is 00:26:15 is a new C++23 feature. And that was just a guy who emailed us three months ago or so and said, is anyone working on mdspan? I think I'd like to. And we're like, cool, do it. So he started sending patches and getting reviews, and after a few weeks of back and forth of, Hey, why don't you do it this way? It might optimize a bit better, or this is a bit more maintainable, or whatever suggestions.
Starting point is 00:26:43 And he revised his patches and sent a new email with the updated patch. And then, yeah, it got to the point where we're like, cool, that's good. You've got, yeah, it's only one bit of it. It's just the std::extents helper class for now, but, you know, everything else is built on top of that. So yeah, we just pushed the first bit of work from this guy who just sent an email and said he wanted to start contributing. So can you share the email address? Yes, so you can find all the mailing lists on gcc.gnu.org/lists.html.
Starting point is 00:27:22 Let me just check whether the .html is there or whether it's just slash lists. All right, we will put the correct link in the show notes. Yeah, that has all the mailing lists. And to get in touch with us, it's libstdc++ at gcc.gnu.org. But then if you want to submit patches, they need to be CC'd to gcc-patches at gcc.gnu.org. Because that's where most of the compiler work happens.
Starting point is 00:27:48 But that list is incredibly busy, there are hundreds of emails a day. So if you are just working on the standard library part, we have a separate mailing list. And we do send all the patches to the main patches list. But then I ignore them there. There's too much on that list. So I only read the libstdc++ emails.
Starting point is 00:28:07 So you make it sound really easy to get started. Presumably you need to do quite a lot of work just to even find your way around the code base. How accessible is that? How does it feel like getting started? So having worked on it for 23 years, it's hard for me to say. And having written a large chunk of it myself, I'm like, well, it's incredibly easy to understand. It's the finest code you'll ever see, Phil. Why do you ask?
Starting point is 00:28:31 But I do like to think that we try not to make it too obfuscated. Tomasz Kamiński, who started, what is it now, May, two months ago, has commented that it took a little bit of poking around. But git grep exists, and you can find where something is. And file names, we try and split things up into headers that have names that correspond to the feature that's in the header. So for example, the vector header
Starting point is 00:29:02 is split up into three pieces for vector. There's the main vector header, and then that includes bits/stl_vector.h. So the sort of stl_vector name may not be particularly obvious when you first start, but git grep will help you find it. And if you just look in vector, you'll see that it includes that. So you can follow the code in a logical way. It's not too bad. And he commented that he found it mostly fairly straightforward, things were implemented the way he would have expected them to be. But this is an expert on the C++
Starting point is 00:29:37 standards committee. Maybe his expectations are unusual. So is it just the code and you work with that or are there any other resources available? Like some page somewhere that says, you know, here's, here's how we implement things or here's the macros that we use or things like that? Um, so I don't think we have too many weird macros. We try to avoid it. We have some for things like attributes or the constexpr keyword where it's not valid in C++98.
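The kind of macro mentioned here, one that stands in for a keyword that older standards do not have, can be sketched as follows. `MYLIB_CONSTEXPR` and `Celsius` are hypothetical names chosen for this example; libstdc++ uses its own reserved identifiers, which user code should not imitate.

```cpp
#include <cassert>

// Expands to the keyword when the language mode supports it, and to
// nothing otherwise, so declarations need no #if around them.
#if __cplusplus >= 201103L
#  define MYLIB_CONSTEXPR constexpr
#else
#  define MYLIB_CONSTEXPR
#endif

class Celsius {
public:
    MYLIB_CONSTEXPR explicit Celsius(int d) : degrees_(d) {}
    MYLIB_CONSTEXPR int degrees() const { return degrees_; }

private:
    int degrees_;
};

#if __cplusplus >= 201103L
// Only newer language modes can check this at compile time.
static_assert(Celsius(21).degrees() == 21, "usable in constant expressions");
#endif

inline void constexpr_macro_demo() {
    // Works identically at run time in every language mode.
    assert(Celsius(5).degrees() == 5);
}
```

The class body is written once; only the macro definition knows which standard is in use.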
Starting point is 00:30:06 So we have a macro which expands to constexpr in C++11 and nothing in C++98. So we don't have to have ifdefs. We just use this macro, and then it's either there or it's not there. And those should all be fairly self-explanatory. We have the DocBook XML manual that I described earlier; HTML pages are generated from that and
Starting point is 00:30:32 that's all online so you can browse through the user manual, and that at the end of it has a big section on contributing which has our directory layout and how the test suite works and what the coding conventions are for, you know, placement and important things like that. Oh, of course. You have to maintain backwards compatibility with C++98, right? Does that mean you have to implement everything in C++98, except like bits that are like, there's a little bit of new stuff here and there, but most of it is C++98? Yes. Oh, interesting. So std::expected doesn't have to compile for anything earlier than 23, because it was added
Starting point is 00:31:15 in C++23. But most of the code in vector has to compile as C++98 because people might for some reason still be using GCC 15 and using the -std=c++98 flag. If it's a new constructor like the from_range constructors for vector, obviously they can use concepts in C++20. But then if they call into other parts of vector, those parts still have to compile as 98.
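A small sketch of that layering, under stated assumptions: `IntBag` and `from_range_tag` are hypothetical names (the tag merely stands in for the role `std::from_range_t` plays in C++23), and the preprocessor conditions are one plausible way to guard the newer constructor, not a description of how libstdc++ does it. The core members use only C++98 constructs, while the guarded constructor uses concepts and forwards to that older code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#if __cplusplus >= 202002L
#  include <concepts>
#  include <ranges>
#endif

struct from_range_tag {};  // hypothetical stand-in for std::from_range_t

class IntBag {
public:
    IntBag() : items_() {}

    // Core functionality: written so that it compiles as C++98.
    template <typename InputIt>
    void append(InputIt first, InputIt last) {
        for (; first != last; ++first)
            items_.push_back(*first);
    }

    std::size_t size() const { return items_.size(); }

    int sum() const {
        int total = 0;
        for (std::size_t i = 0; i < items_.size(); ++i)
            total += items_[i];
        return total;
    }

#if defined(__cpp_lib_ranges)
    // Newer constructor: free to use concepts, but it calls into the
    // C++98-compatible append() above, which therefore cannot change.
    template <std::ranges::input_range R>
        requires std::ranges::common_range<R> &&
                 std::convertible_to<std::ranges::range_reference_t<R>, int>
    IntBag(from_range_tag, R&& r) : items_() {
        append(std::ranges::begin(r), std::ranges::end(r));
    }
#endif

private:
    std::vector<int> items_;
};

inline void int_bag_demo() {
    int raw[] = {1, 2, 3};
    IntBag bag;
    bag.append(raw, raw + 3);
    assert(bag.size() == 3u);
    assert(bag.sum() == 6);
#if defined(__cpp_lib_ranges)
    std::vector<int> v(2, 5);
    IntBag from_vec(from_range_tag{}, v);
    assert(from_vec.sum() == 10);
#endif
}
```

Compiled in an older language mode, the class simply lacks the range constructor; everything else behaves the same.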
Starting point is 00:31:56 And a lot of the stuff that is needed for 98 is fairly stable, and we don't have to go and poke at it often. But yeah, there are large parts of the library that do still compile as 98. And they might have some if-defs there for, oh, we can't use if constexpr in 98, so we use tag dispatching. And we used to implement everything so that it would work for 98.
Starting point is 00:32:18 More and more now, I'll use an #ifdef to say, let's avoid tag dispatching to a bunch of overloads that have a true type and false type as the tag to distinguish them. And we'll just use if constexpr because it compiles faster. It avoids a function call. There's less debug info. It's only good. So rather than saying, no, we just compile everything as the lowest standard that it needs to, 98 is so old that we do sometimes have two implementations with a #if. I was going to say, yeah. One way or the other.
Starting point is 00:32:56 Just fork it with an #ifdef. So if the 98 version rarely changes, with all the new development going on, yeah, makes sense. Yeah. So libc++, to take a small diversion, have gone more extreme in that direction. And they have forked C++11. And they've duplicated all the headers. So rather than using #ifdefs within the header to fork the code into two implementations,
Starting point is 00:33:21 they've now got two headers. And when you include vector, it will say, right, are we, do we want the old vector or the new vector? And you get a completely separate header. And the idea there is to basically freeze the old one and say, we are not even going to touch this anymore unless there are critical bugs. And that's a really interesting direction there. They're literally cloning the, almost the entire library or everything that was valid
Starting point is 00:33:44 in 11. Yeah. And saying we don't touch this anymore. It's in a separate header. So I was just curious, is there anything else where working on the standard library is just really different from working on whatever in-house library anybody else would be working on? The main thing you see first is probably just the naming conventions with underscores everywhere, double underscores or an underscore and a capital letter, which those names are reserved for the
Starting point is 00:34:14 implementation, which means that user code can't use them and therefore we can use them. And when you first start looking at that code, you're like, why is this so ugly? Why is everything written in this way? But you pretty quickly stop seeing them. I don't see the underscores anymore. They are there for an important reason. Unfortunately, because of the preprocessor, which, you know, lots of C++ programmers have bad things to say about it.
Starting point is 00:34:42 If we had an internal member function of vector called m_insert, which was used for the implementation of vector insert, lowercase m_insert is a valid macro name for your code, for user code. So you could define that name to be anything like the literal one or something. And then if we use that inside the standard library, after you've defined it, it breaks. So the deal is that the implementation can have all the names that start with underscores and do whatever it wants with them. And we have to kind of stay out of the way of anything that you might have ever defined as a macro. So even local variables, for example, the preprocessor will just come along and smash them. Yeah. So every variable and every member function name and every class name, an internal helper class, basically every name that isn't in the standard.
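Editor's sketch of the macro problem just described. The class tiny_buffer and all of its members are hypothetical. It imitates the reserved-name style of a standard library header purely for illustration; ordinary user code must not declare names like _M_insert or __pos itself, because those spellings are reserved to the implementation.

```cpp
#include <cstddef>

// User code, seen before the "library" header: a perfectly legal macro.
#define m_insert 1

// Imitation of a library header. Every non-public name uses a reserved
// spelling, so the user macro above cannot touch it.
template <typename T, std::size_t N>
class tiny_buffer {
    T _M_data[N];
    std::size_t _M_size;

    // Had this been spelled m_insert, the macro would rewrite the
    // declaration to "void 1(...)" and the header would fail to compile.
    void _M_insert(std::size_t __pos, const T& __value) {
        for (std::size_t __i = _M_size; __i > __pos; --__i)
            _M_data[__i] = _M_data[__i - 1];   // shift elements right
        _M_data[__pos] = __value;
        ++_M_size;
    }

public:
    tiny_buffer() : _M_size(0) {}

    // Public names are part of the documented API, so users are not
    // allowed to define them as macros in the first place.
    void insert(std::size_t __pos, const T& __value) {
        if (_M_size < N && __pos <= _M_size)
            _M_insert(__pos, __value);
    }

    std::size_t size() const { return _M_size; }
    const T& operator[](std::size_t __i) const { return _M_data[__i]; }
};

#undef m_insert  // keep the demonstration macro from leaking further
```

Even the parameter and loop variable names carry the double underscore, which matches the point that local variables are exposed to user macros too.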
Starting point is 00:35:53 So vector and insert and push_back and emplace. Words like this, you can't define as a macro, because obviously you'd break the public API if you did that. But everything else, like m_insert, no, we can't use that. Or do_insert_using_fast_path or something. No, we can't use that. We have to call it __do_insert_using_fast_path. And that's true for function parameter names, local variables, everything that's not part of the standard API. And that takes some real getting used to when you start looking at it. Like, why is it like this? And then you also,
Starting point is 00:36:32 you have to remember to put those underscores there. So, you know, new contributors often, they get it right for most of it. And then they'll just miss one place and it'll be like, ah, you missed two underscores here. But that's the sort of obvious thing you realize on day one when you start working with the code. There are other, lots of other constraints as well that may not, they're not as glaringly visible to you. So backwards compatibility with 98 or older standards is a big one.
Starting point is 00:36:59 ABI compatibility is a big thing that, unless you're aware of all the things that need to remain compatible and how it needs to remain compatible, it's very easy to say, oh, we'll just add an extra member to this class. We'll just refactor these two functions so that instead of this one calling this, they change the order that they call each other. And simple things like that, that you might just do to your own code base at work because you control everything and you're going to recompile it, might not be possible in code where you've no idea what people are doing to rely on this code or how, you know, how they expect it to remain compatible. So since you brought up ABI compatibility, I think it's worth digging into that.
Starting point is 00:37:46 I bet that's quite a big topic when it comes to using and maintaining the standard library. So what's really the issue here? Why is it you have to be so careful about ABI? Um, when we release a new version of GCC, like GCC 15 a couple of weeks ago, and people install that, the standard library DSO, the dynamic shared object, comes as part of GCC 15. And if you had compiled some of your code last year with GCC 14,
Starting point is 00:38:20 what we guarantee to the very best of our ability is that you can just install the libstdc++.so.6.0.34 or whatever it is, the new one from GCC 15, and all your existing binaries that dynamically linked to libstdc++ can now use the new one and work exactly as they used to. All the observable behavior will be the same. Hopefully it will be a bit faster, or we might use a bit less memory internally, but the contracts, the preconditions, and the post-conditions of all the symbols in that library have to remain the same. So we can't, for example, add or remove functions that were exported from that shared library if any user code could have linked to them. So this is interesting.
Starting point is 00:39:16 So I do have a question. To what extent does that actually work? Like both in terms of like how far back and in terms of like how many cases are there where, you know, you do have to break the ABI. I know that there was this famous ABI break, I guess, over a decade ago now, where you changed like the string implementation to no longer do a copy on write. We didn't break it at all. The old string ABI is still present in the shared library.
Starting point is 00:39:45 So we support ABI compatibility back to GCC 3.4, which is now, well, after 3.4 we went to 4.0, so it's about 20 releases ago, something like that. So about 20 years ago, I think, something approximately. And you could, in theory, take binaries that were compiled with GCC 3.4, and they should run correctly with the libstdc++.so from GCC 15. And it does work. And that's kind of Red Hat's business, and Red Hat Enterprise Linux ensures that that works.
Starting point is 00:40:27 And that's why Red Hat pay people to work on libstdc++ and make sure that it doesn't just get broken in ways that would mess up their customers' business. The std::string ABI transition, as we refer to it, added a new std::string and kept the old one there. So there are two strings in libstdc++, which you can select with a macro on the command line. And all the symbols inside the library that use std::strings, like locales and things like that, are duplicated so that whichever std::string definition your object files
Starting point is 00:41:06 expect to find, they're both present in the same shared library. So you don't need to say, oh, I compiled this piece of code with the old thing, so I need the right library that goes with that. There's one library which works with both. And in theory, you can even link together different, and not just in theory, it works again.
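Editor's sketch of how a translation unit can report which of the two std::string definitions it was compiled against. It assumes the libstdc++ macro _GLIBCXX_USE_CXX11_ABI, which can be set on the command line (for example -D_GLIBCXX_USE_CXX11_ABI=0 for the old string); the function name string_abi is hypothetical. With other standard libraries the macro is simply not defined.

```cpp
#include <string>   // a libstdc++ header defines the macros tested below

// Reports which std::string ABI this translation unit sees.
// The result is fixed at compile time, per object file, which is why two
// objects in one program can legitimately give different answers.
const char* string_abi() {
#if defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
    return "libstdc++, new (cxx11) std::string";
#elif defined(__GLIBCXX__)
    return "libstdc++, old copy-on-write std::string";
#else
    return "not libstdc++";
#endif
}
```

Because the new string lives under a different mangled name, objects built either way can link against the same shared library, but a std::string cannot be passed across that boundary.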
Starting point is 00:41:24 You can link two objects that disagree on the std::string that they see when they were compiled and they both link into the same executable and both work. You can't pass the strings between them because they're not the same string. They have a different mangled name. So that's interesting. Let me just quote from the email we received from Abe Mishler a few weeks ago. So Abe is saying, I have noticed that some code compiled with G++ on Ubuntu 23.10 won't run on certain targets, but code compiled with the same version of G++ on Ubuntu 22.04 will. So is that something else going on here? Then ABI... Yeah, so that's probably glibc dependencies. That's the C library.
Starting point is 00:42:11 Right. All right. Are you working on that too, or is that a different team, different people? That's a different team in our... Well, a different set of people in our team at Red Hat. And obviously it's an open source project, so it's not all Red Hat. There are other people contributing to it. When you compile your code and when GCC is compiled on Ubuntu 23.04 or whatever, it can make use of APIs and symbols in the C library that are present on the machine where you compile it. And it then has a dependency on a version of glibc
Starting point is 00:42:49 that provides those symbols, the version that it was linked against. If you then take that binary to a different machine that has an older version of glibc, it can't find the APIs or the versions of the symbols that it needs in the older glibc. Oh, so are we just talking about basically features here that just don't exist on those older versions? It might be a... it's not necessarily features that don't exist, but say if memcpy is... so this is a real example that happened. You know how memmove supports overlapping regions and memcpy doesn't?
Starting point is 00:43:28 And at one point, glibc's memcpy was just the same as memmove. It was like it handled overlapping ones. But then someone said, hey, we could optimize memcpy if we don't have this check here or if we assume no overlapping, then memcpy can be slightly faster. And so glibc optimized memcpy in that way because the standard says they can. And it broke people. Some people were using memcpy on overlapping regions when they should have been using memmove, but they weren't. And so their binaries would not work with a newer glibc. So the solution is that glibc adds a version number to each symbol. So when you link to the older glibc, you didn't just link to memcpy, you linked
Starting point is 00:44:12 to memcpy at seven or, you know, some, some glibc underscore something. And then when you link, if you link against the newer glibc, you linked to memcpy at eight. Literally the at symbol is in the mangled name, the symbol name. And the newer glibc contains both memcpy7 and memcpy8. So it has two implementations of memcpy. And if you linked against the older one, and you were possibly relying on the semantics of the older one, that's what you get when you linked to the new glibc object. It has both of them there.
Starting point is 00:44:46 It goes, which memcpy do you want? And if you linked against the old one, you get the old one, and it might be a little bit slower. And if you want the faster memcpy, you can recompile and relink, and then you'll rely on memcpy at 8 or whatever the number was. But then if you link with the newer one, you have a dependency on this at 8 symbol. If you then try and run that on an old machine that doesn't have the new glibc, that symbol just isn't there at all.
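Editor's sketch of the source-level rule behind the memcpy story: when source and destination ranges overlap, memmove is the call with defined behavior. The helper name shift_right is hypothetical. The sketch does not reproduce glibc's symbol versioning itself, only the portable usage that avoids depending on one particular memcpy implementation.

```cpp
#include <cstring>
#include <string>

// Copies the first (size - shift) bytes of the buffer `shift` positions to
// the right, inside the same buffer.
std::string shift_right(std::string text, std::size_t shift) {
    if (shift >= text.size())
        return text;                      // nothing would fit, leave as is
    const std::size_t count = text.size() - shift;
    // Source [0, count) and destination [shift, shift + count) overlap, so
    // memmove is required here. Using memcpy would be undefined behavior,
    // even if it happened to work with one particular C library version.
    std::memmove(&text[0] + shift, &text[0], count);
    return text;
}
```

For example, shift_right("abcdef", 2) yields "ababcd": the first four characters are copied over positions 2 to 5 while the first two stay in place.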
Starting point is 00:45:13 It can't run because there's no memcpy at 8 present. Yeah. Yeah. That makes a lot of sense. So it seems like it's actually a bit similar on macOS, right? Where, when you compile, you specify your deployment target, like down to which version of macOS your binary is supposed to run. And then if you're trying to run it on an older version where the
Starting point is 00:45:34 dynamic standard library doesn't have that feature, it just won't run. But, or if you select the lower one. And Mac has a very cool way of selecting that. You can, when you're building, you can say, I want to target this, and you get a totally different SDK. And that way you can't accidentally depend on anything newer. So you get a compiler error if you're trying to use something newer there on Mac. Yeah. And what gets compiled will only depend on, even if you build it on the very latest Mac OS, you can build things that don't depend on the stuff that's on the current system.
Starting point is 00:46:07 That doesn't really happen with Linux. We could do that. And we've talked, Red Hat, about having like an SDK where you'd say, I want to target RHEL 8, even if I build on RHEL 9. Yeah. We haven't done it. Mac does it this way. Android does it this way. Actually, Windows, I think, is yet different.
Starting point is 00:46:28 Windows, Microsoft, I think their thing is just backwards compatibility as much as possible. So what they do, I believe, is if you have a dependency on a newer standard library, and then you try to run it on an older Windows, it will actually say, I need to download this DLL now. It will actually download that for you and everything will just work. Yeah. So they effectively deploy the newer standard library runtimes on older systems. Yeah. Yeah. So that's yet another approach, right? Yeah. If, was it Abe you said, if they were to take the new libstdc++.so
Starting point is 00:47:05 and install it alongside their application, it would now work on the older. Sorry, they'd also have to do it with glibc because it wasn't exactly the same. Oh, can you do that? Can you actually manually do that? I mean, it's hard and it's not supported. And we say, no, don't do that.
Starting point is 00:47:20 The supported model we recommend, instead of the Apple target this SDK, we say you should just build on the oldest system that you need to deploy to. And with containers and VMs and things, it's super easy. Just keep around a container for Red Hat Enterprise 7 or CentOS 7 or Ubuntu 18.04 or whatever it is that is the oldest target you have, build your
Starting point is 00:47:47 binaries on there and then you know that they don't use anything newer. But they will run on things that are newer because, like I said, libstdc++ guarantees that you can still take your binaries compiled 10, 15 years ago and the new library will run them. You can't do it the other way. You can't take things compiled on a new Ubuntu and run it on an old Ubuntu. So I guess Abe's point was why does Windows do that but Linux doesn't? Why is there such different politics there? I don't know. It's a different deployment model, a different support model. Generally with Linux, historically, you've been able to recompile everything that you wanted to depend on. And so if you can't run something, either you get all your binaries from the distribution,
Starting point is 00:48:38 you know, from Red Hat or Fedora or Ubuntu, and they all match. You know, they all use the same version of the runtime because everything was built for that version of the OS. So everything's consistent. It all just works. Or you are a third party software vendor who's targeting a particular version and then you build for that version as well. And if you need to run stuff on an older one, you could download the source, assuming everything you're using
Starting point is 00:49:06 on Linux is open source, which obviously isn't the case. You could download the source and build it for your old system. The problem is when you try and mix this distribution model that assumes everything is built for the current target, it's built to run on the machine that you're running it on, and you mix that with a binary software distribution model where it doesn't, it assumes you can build it once and run it anywhere. And they're just two different worlds. And Windows has always been about supporting OEMs and third party software
Starting point is 00:49:39 distributors who want to build once and run it on any Windows and just have it work. Whereas there are, you know, business Linux distributions, the so-called enterprise distros, that do that, but a lot of it is a very different world where people are not interested in helping some closed source proprietary vendor distribute stuff. For Red Hat's purposes, we certainly want that. We want to be able to, you know, we want our partners to be able to build software that then runs on our OS. And we don't want to say to them like, oh, no, you just have to give your users the source and they'll build it themselves, you know, make everyone build the Oracle database themselves.
Starting point is 00:50:32 But we solve that by providing a fixed target, like CentOS and Red Hat provide a fixed, you know, oh yeah, you just build on this system. But you can't then move those binaries to Ubuntu because that has a different ecosystem with a different set of libraries and different naming conventions for the shared libraries. It's all much more splintered than Windows is. So ABIs and dynamic libraries are one aspect. Another aspect is what happens when something's in a header. So we actually had another standard library maintainer, Louis Dionne, who's working on libc++, on the show just a few episodes ago. The episode was a little bit more about something more specific, we were talking about, like library
Starting point is 00:51:07 hardening and how they do it there. But one of the things we discussed there was all the tricks they have to do if you have like an inline function in a header. And then, you know, you have a hardened version of that in one TU and a non-hardened version of that in another TU. And it's not in a dynamic part, because it's just in the header, right? It's included. And then you basically have an ODR violation, and how they jump through all of these hoops to make sure your code doesn't break. So how do you deal with stuff like that? Where the actual code in
Starting point is 00:51:40 the standard header is different from version to version and somebody uses one here and another there and links it together and then you have an ODR violation. Like, do you have anything like that going on? So there are ODR violations and there are ODR violations. That's more than one definition of ODR. Yeah. Yeah. If you have an inline function and you compile an object that inlines it, and then somebody adds one line to that inline function, like they add a log line to the middle of it, and you compile another object that
Starting point is 00:52:18 uses the new version, which has the line in the middle that does the logging, and you link those two objects together, technically you've got an ODR violation. But in practice, who cares? You know, the addition of that log line is not going to wipe your hard drive and set fire to your cat. It's, you know, the standard says you can't do that. It's undefined, but that's because we can't define what happens if you do that.
Starting point is 00:52:46 The standard cannot guarantee that if inline functions are inconsistent, that your program will figure it out. Well, you could say that you're guaranteed to get exactly one of those different definitions, right? But instead we say you can get nasal demons. Yeah. Yeah. But you don't necessarily get exactly one because if the functions are inlined, then one object has the old behavior and one object has the new behavior. We can't guarantee that they both, unless we never allow inlining. If we globally disable function inlining, then yes, we can guarantee that you get one or the other.
Starting point is 00:53:24 And depending on the linker order, we could even guarantee which you get. But as soon as you allow that code to be inlined, it's no longer possible to say you'll get one or the other. You get both. One object gets one, and the other object gets the other. They both coexist. But it's harmless, really.
Starting point is 00:53:41 All that means is that one of them doesn't do that logging. But it's perfectly easily explainable. It's like, why does this definition not have the log line in the middle? Because it didn't have it when you compiled it. So it doesn't sound like this is a problem that you have to deal with regularly? What you can't do is change, in terms of contracts, to use language that you're intimately familiar with, Timur. You can't change the preconditions and the post conditions of a function that's in a header such that, for example, if we have two functions where one calls another and originally the first function checked some preconditions and then called the second one and made sure they were valid and either threw or called the second function.
Starting point is 00:54:30 And then we decide to move those checks into the second function so the first function doesn't do them. That's possibly an ABI break because now, if you get the new version of the first function, which doesn't do the checks, and the old version of the second function, which doesn't do the checks, you now don't do the checks. Yeah, I mean, this is exactly the context. You moved checks across an API boundary and an ABI boundary, and that isn't allowed.
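Editor's sketch of the refactoring hazard described above. All names (v1, v2, get, get_impl, mixed_get) are hypothetical. Two namespaces stand in for an old and a new version of the same header, and mixed_get models what a mixed link can effectively produce: the new caller paired with the old callee, so no check runs at all.

```cpp
#include <cstddef>
#include <stdexcept>

namespace v1 {  // original header: the outer function checks, the inner trusts
    inline int get_impl(const int* data, std::size_t /*n*/, std::size_t i) {
        return data[i];                       // assumes i was already checked
    }
    inline int get(const int* data, std::size_t n, std::size_t i) {
        if (i >= n) throw std::out_of_range("index out of range");
        return get_impl(data, n, i);
    }
}

namespace v2 {  // refactored header: the check moved into the inner function
    inline int get_impl(const int* data, std::size_t n, std::size_t i) {
        if (i >= n) throw std::out_of_range("index out of range");
        return data[i];
    }
    inline int get(const int* data, std::size_t n, std::size_t i) {
        return get_impl(data, n, i);          // relies on the callee to check
    }
}

// Model of a mixed link: the body of v2::get ends up calling the v1 callee
// that an older object file already instantiated. Neither side checks, so
// an out-of-range index here would be undefined behavior.
inline int mixed_get(const int* data, std::size_t n, std::size_t i) {
    return v1::get_impl(data, n, i);
}
```

Each version is correct on its own; only the combination loses the precondition check, which is why this kind of change is treated as an ABI break even though no signature changed.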
Starting point is 00:55:01 That is an ODR violation where you get a real problem, because suddenly the definitions no longer have the same semantics. But if you had two versions of it, and as long as they're sort of API boundaries compatible, whether one of them writes to a log in the middle or has an assertion in the middle, they're still the same logic, even if
Starting point is 00:55:26 they're not the exact same token sequence. And I think so, so Louis has gone to great lengths to make sure that if you turn on assertions in one of them and not in the other, that you get predictable behavior. One of them will have the assertions. We don't do that. So we do have, we are a hardened implementation. We support all the same assertions and more that C++26 will require. But if you only build half your application with it, it is possible that you get an ODR violation of whether the assertions are on or not. And depending which definition the linker gives you, they may be on or off and that might surprise you.
Starting point is 00:56:05 We're taking a more pragmatic and easier approach of saying, well, if you need them to be guaranteed to be on, turn them on everywhere. Don't link inconsistent objects. Yeah, that makes sense. Yeah, thanks for explaining. That's exactly the context in which it came up with Louis as well. So we really care about some ODR violations where it's like,
Starting point is 00:56:26 and this has happened in the past where we actually released something and broke it where I think it's one of the sorting algorithms where we moved some logic checks across from one function to the one it calls. And it broke code because now nothing was doing the check and we had to revert that change. So it's not theoretical. And so those cases we read.
Starting point is 00:56:48 And that's one of the examples I was talking about with new contributors, those things are not necessarily obvious. You look at the code, and you're like, oh, I can just move this here and refactor this. And somebody's object has already compiled this bit of function, and it's instantiated in their object file. We need to be compatible with that, you know, the contracts of
Starting point is 00:57:10 that existing symbol, that stuff's hard and not obvious when you look at the code necessarily. There's a lot more we wanted to talk about. We are running up to, well, according to my clock, overtime already, again. One more question for you before we do wrap up though. What are you actually working on right now? Is exciting stuff coming? We are trying to stabilize C++ 20 support in the library. Although all the features are there, they're not all right yet and we haven't committed
Starting point is 00:57:46 to the ABI for them. So in the release notes that you mentioned earlier that you're going to put in the show notes, I think, we clearly state that our C++23 and 26 support, where we've added flat_map and this stuff, is experimental. That means that both GCC and the Linux distros like Red Hat and others are not yet committing to that ABI. So when I said you can compile stuff with GCC 3.4 and it'll still run now, that's for the supported standard versions. And unfortunately at the moment C++17 is our newest supported one. So we're trying to get C++20 to that state where
Starting point is 00:58:25 if you compile C++20 code with GCC 16, we will endeavor to ensure that it's still compatible with GCC 20 and 25 and forever, until I retire maybe. But yeah, at the moment, our C++20 support is sort of, we're not guaranteeing we won't break it yet. And obviously we're quite late delivering C++20. So that's our priority at the moment. Well, it's only 2025 as we record this.
Starting point is 00:58:54 So we're not too bad. If we get it out before C++ 26 is published, then I'll be happy with that. But that is a massive problem that the standard is adding stuff at a rate that the implementers can't necessarily keep up with. Oh, we've got Hazard Pointer and RCU in C++ 26. Yeah, good luck with that. I don't know when you'll be able to use them.
Starting point is 00:59:16 Yes, hazardous indeed. All right, well, anything else in the world of C++ that you find particularly interesting or exciting? I don't really get to look at anything else in the C++ world because I've got my head deep in the implementation. I'm excited about reflection. Of course. Yeah. Yeah. I think that's pretty cool. If you listen to every episode over the last year or two, I think that's 90% of the responses.
Starting point is 00:59:44 Right. So before we wrap this up, I just want to say one more thing, like you and other people who maintain and develop, you know, standard library implementations, whether it's yours or Clang's or Microsoft's. Like this is technology that everybody who, you know, writes C++ is just using every day and we kind of just take it for granted that all of that stuff just works. But, you know, it's obviously a lot of hard work to actually make that happen.
Starting point is 01:00:10 As we heard a little bit about that today. So I just want to say a huge thank you for everything you do. Like we all just rely on this work, but it's obviously very, very important work. So yeah, thank you so much for everything you do. Thank you.
Starting point is 01:00:23 Yeah. It's fun, but it is hard. But you can always persuade your listeners to send us patches or bug reports or even just documentation. It's where I started. Everyone will be grateful for those contributions because, like you say, everyone uses it. Those links and email addresses will be in the show notes. So yeah, everybody feel free to reach out and support GCC.
Starting point is 01:00:50 So apart from that, anything else you want to tell us before we let you go, including where people might be able to reach you, I presume just those GCC mailing lists. Yeah, my email address is pretty easy to find on those mailing lists. But I'm usually pretty visible on Reddit and Stack Overflow and mailing lists for GCC and thereabouts. Some people say it's hard to miss me online in the C++ world. But yeah, the GCC mailing lists are a good way to reach out to me. All right. Well, thank you again, Jonathan, for taking the time to be our guest today. It was a very, very interesting discussion. So thanks a lot. And yeah, thank you everybody
Starting point is 01:01:29 else for listening. Thank you. Bye bye. Thanks so much for listening in as we chat about C++. We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff that you're interested in, or if you have a suggestion for a topic we'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. We'd also appreciate it if you can follow cppcast at cppcast on X or at mastodon at cppcast.com on mastodon and leave us a review on iTunes. You can find all of that info and the show notes on the podcast website at cppcast.com. The theme music for this episode was provided by podcastthemes.com.
