CppCast - San Diego EWGI Trip Report
Episode Date: December 13, 2018

Rob and Jason are joined by JF Bastien from Apple to discuss the San Diego C++ Committee meeting from his perspective as the chair of the new Evolution Working Group Incubator. JF Bastien is the C++ lead for Apple's clang front-end, where he focuses on new language features, security, and optimizations. He's an active participant in the C++ standards committee, where he chairs the Language Evolution Working Group Incubator ("Oogie" for short). He previously worked on WebKit's JavaScriptCore Just-in-Time compiler, on Chrome's Portable Native Client, on a CPU's dynamic binary translator, and on flight simulators.

News: Exploring C++20 Designated initialisers; std::embed for the poor (C++17); RangeOf: A better span

JF Bastien: @jfbastien

Links: 2018 San Diego ISO C++ Committee Trip Report; C++ Current Status

Sponsors: Download PVS-Studio; Technologies used in the PVS-Studio code analyzer for finding bugs and potential vulnerabilities; JetBrains

Hosts: @robwirving @lefticus
Transcript
Thank you. ... and C#. The PVS-Studio team will also release a version that supports analysis of programs written in Java.
And by JetBrains, maker of intelligent development tools to simplify your challenging tasks and automate the routine ones. JetBrains is offering a 25% discount for an individual license on the C++ tool of your choice: CLion, ReSharper C++, or AppCode. Use the coupon code JetBrainsForCppCast during checkout at JetBrains.com. In this episode, we discuss designated initializers and std::embed
alternatives.
Then we talk to JF Bastien from Apple.
JF talks to us about the San Diego C++ committee meeting.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how's it going?
Pretty good, Rob. How are you doing?
Doing okay. Doing great since I saw you two hours ago.
Yes, it's a bit of an inside joke for our listeners, I guess.
But we're recording two episodes back-to-back for the first time ever.
Yes, we've sometimes done two in a week, but we're doing two in a day just for scheduling reasons. Yes. Yeah. So this is the disembodied voice of the past or something like that, right? Well, anyway, at the top of the episode, I'd like to read a piece of feedback. We got an iTunes comment here, and it says, great hosts and always informative. This is from Plain Jim.
And he says,
this is a wonderful podcast that helps me learn about important people in the
C++ and software community.
It's not dry at all.
And I really enjoy the kind and professional hosts who make it very special.
So thank you,
Jim.
Yeah.
Very nice.
It's not dry at all.
Yeah.
Well, I can see how some programming podcasts could be quite dry.
Yeah.
No, I mean, I would expect ours sometimes is, honestly.
But it's good he doesn't think it is.
Yeah.
Well, thank you for the kind words, Jim.
We'd love to hear your thoughts about the show.
You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com.
And don't forget to leave us reviews on iTunes.
Joining us today is JF Bastien.
JF is the C++ lead for Apple's Clang front end,
where he focuses on new language features, security, and optimizations.
He's an active participant in the C++ Standards Committee, where he chairs the Language Evolution Working Group Incubator, Oogie for short.
He previously worked on WebKit's JavaScriptCore just-in-time compiler, on Chrome's Portable Native Client, on a CPU's dynamic binary translator, and on flight simulators. JF, welcome back to the show. Hey, thanks for having me again. Yeah, this is your second time, right? But it's been a while. Yes, it's been a long time, like three years or something, right? Yeah, something like that. Maybe, yeah.
I think last time we talked about WebAssembly,
shortly after it was announced by the four browser vendors.
Yes.
And now that's a real thing.
That's in the real world doing things.
Yeah, it is.
It seems to be getting a lot of traction among just web developers in general.
There's still a lot of ongoing work,
but the basics are there and seem to be working really well.
I just saw a comment on Reddit from someone that said
that they are in the process of porting some C++ code to TypeScript
and was looking for some guidance.
And I have respect for TypeScript,
but my immediate response was, why don't you just
compile that C++ for your web browser? Yeah, that's certainly easy. It's certainly the easiest
approach. But in a lot of cases, if you do something that's native to the platform,
that's well supported, you can trim down code a lot and have kind of faster execution.
Not that WebAssembly can't get that,
but there is a cost to the bridging,
so you kind of have to think about
the way you're going to do that bridging.
Right, so if you're having to do lots of bridging.
Okay.
Right, exactly.
And so there's a lot of upsides.
If you're trying to do compute-intensive stuff,
WebAssembly is really good,
but if you're going to try to get, say, data in and out of the WebAssembly sandbox or something
like that, then your C++ code might not be the bottleneck.
Right.
In some cases, it makes sense.
In some cases, just TypeScript or JavaScript are really good solutions.
That's what the web is, right?
They do it really well.
Right. It's really a question of what's the better tool for the job. Yeah. I mean, it's just a random comment on Reddit, but it would be curious to have a full conversation with this person and see what they had attempted so far and checked out and everything. Yeah. Okay. Well, JF, we've got a couple of news articles to discuss, and then we want to talk more about your own thoughts out of the San Diego C++ meeting and the work you did as the Oogie chair. Okay. Sounds great. Okay. So this first post is on Therocode's blog, and it's exploring
C++20 designated initializers. And it is a really good post going into kind of the history of aggregate initialization and why designated initializers are going to be a great feature for C++20. This kind of seemed like it was a long time coming. I don't know when it made it into C, but a while ago, right? C11 or before. All right, this is a feature that came from C, right? Yeah. Right, it is. The difficulty
though is C++ has constructors, and so designated initializers, just based on that, have a few issues. If you can reorder the members or something like that, it's the same as the constructor where you use the colon and you have the list of initializations. If you don't put them in the same order as they were declared in the struct, then there aren't necessarily obvious rules on what gets initialized when. And so I think that was one of the biggest holdbacks from getting it into C++, just kind of getting some sane rules in there. And otherwise, the C rules are kind of weird in some cases. And so the committee really wanted to not just adopt the C rules as is, but have something that's compatible yet fits well with C++, I think.
And it's ultimately not going to be fully compatible,
right? Because it will allow
braced initialization of
the named
designated initializers,
which C will never
allow. Is that correct?
Well, I don't know if they'll never allow it.
Well, no, I mean, right, but it would be incompatible today.
Yeah, certainly there's a common subset that will work in both,
and then there's things both in C and C++ that won't interoperate at all.
Okay, right.
But I think we got a pretty good design out of that feature,
so I'm pretty excited.
No, I'm pretty happy with it.
The only thing in my playing with it,
because as far as I know,
no compiler currently supports
the braced initialization syntax.
They all basically just, as far as I can tell,
for the most part,
enabled what they had from the C side of things
at the moment.
Right.
And some of the things that I want to know
is how does this interact
with
deduction guides?
That kind of thing.
Because you can kind of
if you squint your eyes
synthesize a constructor
by using a deduction guide
and I want this to play nicely with deduction guides
but it wasn't clear to me in reading it
and playing with it how well it would.
Yeah. Yeah, I don't know.
Okay.
I have no idea.
Okay.
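The C++20 ordering rule discussed above, designators must follow declaration order, and skipped members fall back to their defaults, can be sketched like this (the struct and values here are invented for illustration):

```cpp
#include <cassert>

// A plain aggregate: no user-declared constructors, so it is
// eligible for designated initialization.
struct Point {
    int x = 0;
    int y = 0;
    int z = 0;
};

Point make_point() {
    // C++20 designated initializers: designators must appear in
    // declaration order, and skipped members (here .y) fall back
    // to their default member initializers.
    return Point{.x = 1, .z = 3};
    // Point{.z = 3, .x = 1};  // ill-formed in C++20: out of declaration order
}
```

The ordering restriction is exactly the "sane rules" trade-off JF describes; C is more permissive here, which is part of why the two dialects of the feature aren't fully compatible.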
So the next post we have is "std::embed for the poor (C++17)," or cross-platform resource storage inside the executable.
And this references std::embed, which we talked about with JeanHeyd Meneide a while back.
It doesn't look like std::embed is going to make it into C++20.
So this is kind of an alternative tool that you can use for similar functionality.
And the post kind of dives into how he goes about doing it.
Yeah, it makes a lot of sense. Yeah. I didn't want to joke around that. Yeah, totally. Go ahead. Go ahead, Jason.
On the Reddit post about this article, specifically, JeanHeyd Meneide comments, like, has a discussion with the author of this article about the pros and cons of the different options for how he implemented this. So I would suggest that for our listeners who want some more information on the background and history of these options.
Right, and I expect that we'll see... I don't want to give too much away, but we saw std::embed in San Diego, and I expect we'll see an update in the next upcoming meeting in Kona.
And I think there's a pretty big discussion
going on inside the committee
about stuff related to that in general.
So I'm pretty hopeful that we'll get something
in that direction.
I'm just very worried about the build system implications, right?
If you can pass a constexpr string to std::embed, such that at compile time you figure out which file to include, then your build system has to run all of constexpr evaluation before being able to determine the dependencies.
Now, it's kind of a niche feature in my opinion.
So I'd be really happy to not have that part of the feature
yet have the other type of dependencies, right?
Especially with modules, like you kind of want to have really quick, easy parsing of the module,
and std::embed kind of defeats that, whereas modules tries to give you,
kind of make your build system easier to support in some senses.
So I think there's a balance to be struck, and it's not clear where the committee wants to land on that.
So you're saying, if I understand correctly, a string literal is fine,
but if it's any constant expression, that becomes more difficult.
Well, so the idea is you would say std::embed(foo), where foo has been derived from playing an Amiga emulator as a constexpr thing.
As a hypothetical possibility.
As a hypothetical thing.
I'd be worried about the compile time
aspect of that, of figuring out what the
dependency was. I'd like
to construct a graph really
quickly by quickly pre-processing
the top
of every file. And then when you
see the keyword
module or something like that, you know you're done
with pre-processing
the module.
In a perfect, modules-are-in-C++20 world, your build system should be able to figure out file dependencies really quickly in an ideal world. And I'm worried that std::embed would defeat that. But at the same
time, I want std::embed.
I've done stuff similar to this blog post before, and so it's a really interesting feature to have, to be able to say, take this file and constexpr-create this result from the file. So, yeah, I kind of want both. Yeah, the project I'm currently working on with a client has been dealing with some version of this solution, I think, for six years or so,
rolling their own CMake kinds of things
for trying to make effectively a single binary
that has everything it needs in it.
That's really our goal.
We don't need the data at compile time,
but we definitely need that data in the binary
and access to it at runtime.
And that aspect of this, I think, is highly necessary. The compile-time part is fun, because I like doing compile-time things.
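The "poor man's std::embed" pattern the post describes, a build step that turns a resource file into a generated header of bytes, can be sketched roughly like this (the array contents and names are invented; a real project would generate the header with a script or a CMake step):

```cpp
#include <cstddef>

// What a generated header might contain: the resource file's bytes,
// baked into the binary as a constexpr array (contents invented).
constexpr unsigned char kResource[] = {'H', 'e', 'l', 'l', 'o'};
constexpr std::size_t kResourceSize = sizeof(kResource);

// The data is usable at compile time...
static_assert(kResourceSize == 5, "resource size known at compile time");

// ...and at runtime it is just static data in the binary, no file I/O.
inline const unsigned char* resource_data() { return kResource; }
```

std::embed would make the generation step unnecessary, which is exactly the build-system trade-off discussed below.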
Right. Okay. And then the next post we have is "RangeOf: A better span," and this is a follow-up post where the author just talked about why he doesn't like span. I didn't read the first post.
Did you, Jason?
I did not.
Okay.
But this one just goes over kind of refreshing why he doesn't like span
and this alternative solution with using the ranges, which are going to be in C++20.
Yes. Yes.
Yeah.
The committee is pretty split on that in general, on how span should look, whether its size
is signed or unsigned, and other stuff around that, how comparisons work or don't, whether
the type should be regular or not.
So there's kind of a lot of different trade-offs in that design space, and people who are much better than me at the library side of things are having pretty serious discussions about this, and content is part of that. So it's been good to see people try to find an agreeable solution.
I'll be curious how it all shakes out.
I don't feel like I personally have, I don't know, a horse in this race or whatever, like
whatever the committee gives me, I'll be fine with. I have to admit though, when I was reading this article, I got
distracted by something right towards the middle where he says, but in general, I think we should
try to move away from the header source file separation model. And that's a link to a separate
article that he had written a month and a half ago.
And it seems to be effectively an argument for header-only files, like, you know, just putting everything inline, everything inline all the time. And I know organizations who are doing this, right? We discussed that with Lenny two episodes ago. Yeah.
And I was just kind of curious if JF had seen this
or if he had an immediate gut reaction to it.
Yeah, well, I mean, especially in, you know, the pre-C++11 times, we used to do that for header-only stuff everywhere, kind of like what Boost does, right?
And I've kind of gone back and forth on things like that.
And in a way, I find it hard to navigate when there's all this code in there, right?
I kind of like the...
It's not just like you abstract away the class, but it's also when, as a human, you read the
code.
I like having the abstraction of the header and the CPP file or something like that.
But at the same time, you could argue that's the job of the IDE to kind of hide things from you unless you actually want to see them.
And again, I kind of go back and forth between using an IDE and not.
So I use Xcode fairly extensively, except when I use Emacs.
And there's no logic to my usage.
I just kind of go back and forth and try to deal with the two of them really well.
I think modern codebases care about compile time a lot,
and so they don't tend to put everything in the header file that much.
But I guess we'll see what happens with modules, right?
How that changes the landscape,
how IDEs make your life easier there or not.
So, JF, do you want to start off by just telling us a little bit about the new position you
had in San Diego as the Oogie chair?
Right.
Yeah.
So, Herb sent out an email or a blog post, I guess.
Well, first he sent an email to the committee, then he posted a blog post saying like, we've
got a huge influx of papers coming in. And it's kind of interesting because Bjarne
had kind of predicted it in a way. And everyone was like, nah, you're just worried that there
seems to be a trend, but that'll die down. And it turns out Bjarne was really right.
So for San Diego, there was an influx of papers. I think it was 276 or something like that. That's more than twice what there's ever been in a pre-meeting mailing. And so, you know, it's not possible to read all the papers. I think he calculated it's more than three times the works of Shakespeare or something like that. Again, it doesn't make sense, right?
And at the same time, there's an influx of new participants.
So I think there were 180 people who showed up in San Diego.
Now, some of it is historically there's been an influx just before a standard kind of gets nailed down, right?
So before it's kind of finalized.
And so some of it is trying to get last-minute features in
or last-minute fixes in so that the standard is better.
And you definitely want to prioritize those things.
At the same time, there's features that you don't want to rush into the standard.
So you want to say, well, let's take a step back.
And this feature could make it to 20, but it's not a good idea.
We're not confident about the quality of the feature yet, even though it might seem good.
So there's kind of a balance to be struck.
And also because we have a lot of new people coming in,
it's hard to know how to write a paper that does the right thing.
There's a lot of unwritten rules about how the committee works
that people have to kind of just figure out.
And we've been trying to get better at documenting those things, but it's just a fact that the committee is not that great at documenting
those things yet. So you'll write a paper and you'll think like, I got this, it's really great.
And then you'll realize, oh, there's all these things I didn't follow. And that's true both on
the language side and on the library side. And so I think the goal that Herb had was to really help new people write higher quality
papers by having an incubator that tries to work on the features in the smaller group,
such that when it gets to the library group or language group, the feature is way more polished.
You don't waste as many people's time trying to polish
something, right? So that's one aspect. Another aspect was to just say no to features that
shouldn't go forward as early as we can with as few people involved as necessary, right?
It's not really saying no, but it's more figuring out why
something fits or doesn't fit and what could be done to make it fit, right? Sometimes you're
saying no, because you'd like to see the feature done a different way. In a lot of cases, it's just,
well, this is nice. We could create a macro for this, but really we should work on constexpr to
make it happen instead, or on reflection to make it happen instead. So that's another way of trying to redirect the feature: like, well, you want to do this, but have you thought about doing it this other way? And then it's trying to merge features as well. We had, you know, two proposals for pattern matching, which is a huge language change, and we don't need two proposals for pattern matching, right? It's good to get to a certain point where you can evolve proposals separately,
but at a point in time, you've got to choose one.
So the goal in the incubator is to kind of try to see trade-offs
and try to get just one feature out of two papers.
So I think those are the goals of the committee.
And at the same time, the language group and library group want to move forward on certain papers that will go into 20. But you don't want to pause the world and not see anything that will go after 20, right? You want to have some amount of iteration. Even though you're trying to finish 20, you still want to work on future features, just like, you know, pattern matching. So that's another goal of the incubators: really, let's get a bit of iteration going, so that once 20 is finished, we have stuff that's not all brand new. That's kind of ready to go, or closer to ready to go than if we just hit the pause button.
So that was really the goal of the incubators.
And I think it was a pretty good setup that Herb had to try to make that happen.
And then he asked me and Bryce to chair the incubators, the language one for me and the library one for Bryce.
And it seems like it's worked pretty great.
Throughout the week, we had 10 to 20-ish people.
I think at the top, we had 30 people in the room for pattern matching at some point,
but not that many people compared to the other rooms. So I think that goal was met. And we saw,
I think for my group, something like 30 papers. So that was pretty productive, I think.
In a way, it sounds like the incubator would be the more interesting one for a new person to visit, because you're just, you know, seeing exciting new things coming in and kind of seeing the start of the process, I guess. Yeah, definitely. Although part of the goal of the incubator is to also
bring some of the worries that the later groups would have to those papers. And so I think it's
useful to, if you're new to the committee, kind of shop around the different groups, see what they talk about,
see what concerns they bring up, and also see who the...
We really have...
I don't think there's one expert who understands all C++,
but in the committee there's key people who understand things very, very well.
And knowing who to go to to get good input on a particular topic
is also really useful.
So it's useful to go around and see
who everyone turns to for particular
things so that you can
not waste everyone's time and just
talk to that person directly if you have questions
on a particular thing.
And, you know, different groups really have different worries, right? The final groups, the core group and the library group, really care about getting the wording right,
integrating things into the standard really well,
whereas the evolution groups have more kind of higher level,
where should the language go or the library go, what should it do,
how should that feature look like?
And things like that.
So there's really different concerns.
And to me, it was obscure before joining the committee
how that part worked.
And just seeing it happen has been kind of really interesting.
It's a fairly well set up set of responsibilities.
And people kind of hold themselves to that.
It's really well done, I think.
So if you're in one of these discussions, like an incubator, and you're like, well, we don't really know how this would interact with this other feature, and so-and-so is the expert on that, but that person's in a different room, do you ever, like, go try to grab that other person real quick? Oh yeah, definitely. Okay, so there's an IRC channel where the committee, you know, we announce which paper is being discussed where,
and where people get queries for like, hey, could this person be sent here?
Or, you know, you're about to present, how about you show up type of thing.
Because the scheduling is pretty flexible, right?
There's so many people, you can't have everyone in the room at the same time.
And so you kind of give a heads up to people.
And definitely, when there's a question where you'd like to have someone in the room, that happens. Although what we usually try to do is we post a schedule internally of which paper is going to be discussed roughly when, and then people can either talk to the chair or post on the wiki or something like, hey, I want to be there for this discussion. So you try to ping them beforehand so that they're there. So they kind of self-nominate.
But another thing we do, at least in the incubators,
for example, we did this with std::embed,
is we say, well, we're not ready to forward this, because this isn't the group to make this decision.
We'd like you to go to, for example, the language evolution group,
ask this set of questions,
then come back here with the answers, and we can iterate in that direction.
So we did that with std::embed.
We did that with nodiscard.
So there's a paper that adds a message to nodiscard.
And we want to know, would library use it?
And if so, how?
Do they want to have a policy of this, or do they not want to use it for a variety of reasons right it's i think it's
important for us when we standardize a language feature to know whether a library would use it
and how and so that's part of the discussion so we forwarded that paper with that question
still open yeah it's interesting to me that you just brought up no discard because i just got a
message from someone on twitter saying that he started putting no discard throughout his code base and then wrote a,
uh,
lib tooling tool to go in and automatically apply it to every function that
he thought was,
was appropriate and ended up finding apparently dozens of bugs and their
source because they were ignoring things they shouldn't have been dropping.
And it seems like a lot of people are like, wow, nodiscard should kind of almost be everywhere. I mean, for some definition of everywhere.
Right, and it's pretty interesting. The committee
has been discussing this this week, and there's
places you don't want it, right?
You might say, well, it's a kind of
misfeature of the library
if it returns something
that's fine to ignore, right?
But something like printf is the
one nobody ever checks, right? Like nobody looks at the return of printf. But you might argue,
well, you shouldn't have been returning something if nobody checks it. But I guess the idea is also
should the default in the language be nodiscard, and then, you know, you would say maybe-ignored in the cases where it makes sense to ignore them.
So I think what the committee is going to try to do is write a paper that tries to see
where you should apply it or you shouldn't as a kind of policy matter,
and then decide whether we actually do apply it to the STL
or whether it's just a proposal that implementers do this.
So I think Stefan, STL at Microsoft,
volunteered to write such a paper,
and I guess we'll see where it falls.
The thing to remember is attributes are kind of weird in C++
where compilers can just ignore them,
and so they're not really normative in that sense.
And so even if we put it in the whole STL,
it doesn't really mean anything.
Although, you know, I'd like my STL implementation to follow whatever rules are there for attributes.
So, you know,
attributes are a bit weird, honestly.
Yeah. Yeah, but, you know,
it's almost peer pressure in a way.
Like, if two out of the three compilers are giving us good warnings on nodiscard, the third one kind of has to catch up and do it too.
Right, yeah.
And in a way, language implementations already give you warnings on discarded results, right? So if you do some comparisons or certain expressions and they have a side effect but you don't look at the result, then, with -Wall or something, you'll get warnings in some cases. And so it's a bit weird that we have a distinction between library and language, where discarding things already warns in some cases but not others. So I guess it will be nice to kind of find those bugs with nodiscard.
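A small sketch of the [[nodiscard]] behavior under discussion (the function names are invented; the message form is the C++20 extension mentioned above, so it's guarded by a feature test):

```cpp
#include <cassert>

// C++17 [[nodiscard]]: compilers warn if the result is discarded.
[[nodiscard]] int checked_sum(int a, int b) { return a + b; }

// C++20 adds an optional message, the extension the nodiscard paper
// proposes; guard it so pre-C++20 compilers still build this sketch.
#if __has_cpp_attribute(nodiscard) >= 201907L
[[nodiscard("ignoring the handle leaks the resource")]]
#else
[[nodiscard]]
#endif
int open_handle() { return 42; }  // invented stand-in for a real resource

int use_both() {
    // checked_sum(1, 2);  // would typically warn: result discarded
    int h = open_handle(); // fine: result is used
    return h + checked_sum(1, 2);
}
```

Note the point JF makes below: attributes are ignorable, so a compiler that emits no warning here is still conforming.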
So we mentioned std::embed and nodiscard.
Were there any highlights out of the Oogie group?
Were there any papers that you kind of just forwarded straight on to Evolution?
Yeah.
So total, we had 60 papers on our plate.
We saw 30 of them.
There are five that we sent on to EWG, and two that we sent just to get feedback. And there are six that had no consensus to continue work as is, but we gave feedback to the author on why we think that's the case and what could be done to change this.
So at a high level, the ones we sent to EWG were really just ready, and I think it was trying to make sure that they were ready before EWG saw them, because their schedule was really, really overbooked.
So first one, there was array size deduction in new expressions, which was kind of just an oversight that was missing in the language.
I think Bjarne pointed it out. Nobody really seems to have run into this ever, but it was just a weird inconsistency. And so this was just fixing something where Bjarne was just like, oops, I made a mistake like 30 years ago. How about I fix it?
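The inconsistency being fixed here (array size deduction in new-expressions) is easy to show; a hedged sketch, with a pre-C++20 fallback since compiler support for the deduced form is recent:

```cpp
int sum_new_array() {
    int local[] = {1, 2, 3};  // size deduction always worked for locals

#if __cplusplus >= 202002L
    int* p = new int[]{1, 2, 3};   // C++20: the size is deduced here too
#else
    int* p = new int[3]{1, 2, 3};  // previously the size had to be spelled out
#endif

    int s = p[0] + p[1] + p[2] + local[0];
    delete[] p;
    return s;
}
```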
So that was kind of neat. Then there was something that was almost a no-op, where we're making string literals for char16_t and char32_t be explicitly UTF-16 and UTF-32.
It was effectively the case, but we didn't say it, and the wording around it was a bit weird.
Okay.
And so the paper just fixed things in a way that was sensible. Then there was something about named character escapes,
which came out of the SG16 Unicode group that I really liked.
So the idea is, you know, when you do Unicode literals,
you can do, like, backslash U something,
or some syntax like that in a character literal,
and that inserts a Unicode character in your string.
And you have to provide the code point for that.
And I don't know about you,
but I don't know that many code points off the top of my head.
So I know that 20 is a space or whatever,
but that's about the extent of my knowledge.
So what that does is there's a Unicode standard
for named characters for Unicode.
And those are stable names.
And so it means you can put the Unicode character in your code using that escape sequence,
and you name it, and you say, like, this is the character I have.
And then it's obvious what the character is, right, because it's spelled out.
And it's useful because, you know, plain backslash-u escapes are really not obvious about what you're talking about.
But it's also, if you just insert the Unicode character in your source code,
some of them look like other characters, and you kind of don't want that in a lot of cases.
So having the named escape makes it much nicer.
You can just say, no, this is not like the letter K.
It's the letter K for Kelvin, which is separate in Unicode for some weird reason.
So it allows you to kind of call these things out much more easily in your source code. So that's really neat.
So is this like an optional name that's with the code point, or you name it outside of the
string literal and then can use it inside of it?
So instead of using the code point, you can use that.
Oh, okay.
It's really like you say,
put this character here inside the string literal,
and so then it substitutes that in there.
I think for some reason... So it's the same thing as backslash-u something, but with the name instead.
Okay.
So that was a pretty neat feature.
It's adopting something that Unicode has standardized,
but Unicode gives you a lot of options on what you accept.
It allows you to ignore underscores and dashes and spaces
and ignore the capitalization and other things.
And so we had to choose which subset of that we were going to allow.
And I think we allowed something that is non-surprising,
but that works pretty well.
Non-surprising.
Well, yeah, because you can change the capitalization
and add spaces and dashes and whatever
according to the Unicode spec,
and then that's the same stable name, in effect.
And that seems a bit weird,
so I think we allowed capitalization to change
but not spacing and dashes and underscores.
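The readability point can be illustrated with the Kelvin-sign example JF gives (the named-escape form is shown only in a comment, since at the time of this discussion it was a proposal; it later became valid C++23 syntax):

```cpp
// With only code-point escapes, the intent is opaque: this looks
// like a K but is actually U+212A KELVIN SIGN.
constexpr char16_t kelvin = u'\u212A';

// The named-escape form from the paper spells the intent out:
//   constexpr char16_t kelvin_named = u'\N{KELVIN SIGN}';
// Per the committee's choices described above, capitalization of the
// name may vary, but spacing, dashes, and underscores may not.
static_assert(kelvin == 0x212A, "same code point, clearer spelling");
```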
Right.
Right.
So it sounds like the incubator groups
were definitely a success.
Do you think they're going to be meeting
at all future ISO meetings?
Is that not decided yet?
So the intent was to relieve pressure for San Diego,
but I think at least for the near future,
we'll still probably meet and we'll try to gauge how much influx there is for new papers, right?
And how much of a need there is.
So in the case of Oogie, we only met for three days out of the six that the committee usually meets. And Loogie met for three and a half, if I remember correctly.
And so we'll kind of go on an as-needed basis.
And we might do telecons in the middle
to try to move certain things forward as well.
But if there's no need anymore,
then I don't think we should keep meeting.
The same way the other study groups
have shut down over time, right?
So the file system study group doesn't meet anymore
because filesystem has already shipped.
So I guess we'll see how things work out.
Okay. Okay.
I wanted to interrupt the discussion
for just a moment to bring you a word from our sponsors.
PVS Studio Analyzer
detects a wide range of bugs.
This is possible thanks to the combination of
various techniques, such as data flow analysis,
symbolic execution,
method annotations, and pattern-based
matching analysis. The PVS-Studio team invites listeners to get acquainted with the analyzer.
So I want to maybe talk a little bit more about some of the other core features
that had movement at San Diego.
There was a special meeting on modules before CppCon,
and they're now saying it achieved consensus.
So it looks like it will be making it into C++20?
Well, so it's still not clear.
So modules haven't been voted into 20 yet.
And we'll see what happens.
I think the upcoming meeting is the last one to make that happen.
But fundamentally, there was the TS,
which was championed by Gaby at Microsoft.
And the TS did a lot of things really well.
And at least within Microsoft's experience, they had a lot of very positive uptake of
their modules approach.
But at the same time, the implementation of modules that predates the TS that exists in
Clang has had a lot of success with Clang users in general.
Especially, you know, Google was really behind the way Clang did some of the features of modules.
And so was Apple.
And so we worked with Google to try to take the core of what we thought Clang did better than the TS.
And then Richard Smith wrote a proposal called Atom, I think,
that had kind of alternatives for what the TS was trying to do. And he was joined by
Nathan at Facebook, who works on the GCC implementation of modules. And so I think
by having those three implementations try different things out and see what works for
their user bases, we got a better
outcome through that meeting that we had about modules. And that's where we reached consensus
and where there's ongoing work to try to get the TS into a state where what we voted on is what the
wording reflects. And then that can be merged into the IS potentially at the upcoming meeting.
And so I think everyone's kind of happy with the outcome.
I think we got some really good changes in there.
Cool. Okay.
How about the coroutines proposal?
There's the core coroutines paper that was discussed,
and that's another one where it was said
there was going to be a decision to be made in Kona.
Right, right.
So coroutines as a TS got a lot of traction early on,
especially because Gor, who's also at Microsoft,
did not just the implementation in MSVC,
but he went and did an implementation in Clang
and optimized it through LLVM as well.
And he gave some pretty entertaining talks about it.
And that really helped the TS move forward much faster because he was willing to do all
that work to show that it's not just a one implementation thing.
Now, the problem was as people start using coroutines, they realize that they had issues
with some of the capture, right?
So it's not obvious when a coroutine is going to capture certain things
and cause heap allocations.
And it looks a bit different from what lambda syntax looks like.
And so there was a feeling by some people that coroutines should kind of look more like
what lambdas look like without being very clear on what that meant, right?
So the core coroutines paper
tried to explore some of that.
And at the same time,
there's maybe a dozen or so customization points
in the current coroutines TS.
And it's a bit weird.
The language feature of coroutines
reaches into library-like things
to do customization.
It looks a bit weird.
And there was a feeling that we might be able to do something better,
but not really clear on what that would be.
And so that's the other thing that people were interested in exploring.
So that's what Core Coroutines tried to do.
The first version of the paper was really rough.
It was very hand-wavy and didn't really have enough concrete stuff to convince people.
And so Coroutines was trying to move forward, but just barely didn't make it into the standard. And now Core Coroutines is in a better state.
The authors have been working really closely with Gore to address some of the issues they have.
And at the same time, some of the Facebook people brought up worries that they had with
just kind of the asynchronous natures of Coroutines and how you would use them and integrate that with the library and other stuff.
So that's another thing that's ongoing of trying to get all of these actors
really happy about what coroutines do,
so that if we do vote it into 20,
it's a really solid feature that we don't regret.
And there's a lot of demand for coroutines.
People really want it.
So it's a bit difficult to say,
well, let's hold off for 20, right?
Like maybe we want to see what it'll look like
with parallel algorithms.
Maybe we want to see what it'll look like with networking.
Or maybe we don't.
Maybe we believe it's good enough to ship as is.
So I guess we'll see in Kona how things fall.
And I have to ask you about the paper you presented, Signed Integers are Two's Complement.
Right.
How did that go?
Yeah, so I presented that paper a while ago, and I gave a CppCon talk about it this year
as well, and it went really well, surprisingly.
I thought I would get more pushback. So initially I came at it with kind of a mild rage, because that type of thing causes security bugs, as I talk about in my CppCon talk, and I didn't know of any architecture that wasn't two's complement that was relevant to C++ nowadays. And so I kind of started doing some research to figure out, well, why do we have...
Well, first, what liberties do we have
in C and C++ implementations
with respect to signed integer representation?
And second of all, does anyone take those liberties?
What does it look like when people try to use them?
And then if we made changes,
say we said everything was two's complement,
what changes could we make,
and what effects would that have on the language and on users?
So initially I was proposing that not just storage be two's complement,
but also arithmetic, meaning that overflow would wrap.
And that received huge pushback, partly because in a lot of cases,
when you have a bug, having wrap around wouldn't actually fix your bug.
And I go into examples of that in my
CppCon talk where
there's bugs where wraparound
was the behavior that occurred
and the bug was still there
like in Donkey Kong
and Pac-Man
you have bugs like that
and having wraparound wouldn't fix anything
same thing with like
the Ariane 5 rocket that exploded a few years ago.
Having wraparound would not have prevented the rocket from exploding.
So one of the worries people had was,
well, you're going to define wraparound, it's not going to fix any bugs,
but it'll make life harder on sanitizers,
because then they can't just trap when wraparound occurs
and still be able to conform to the semantics of the language.
That's one worry that people had.
And at the same time, there was a worry that it would hurt performance.
And there's numbers that people have measured, in the range of insignificant to a 30% perf hit,
if you were to define integer overflow to wrap.
And I think we can make up some of that perf hit, but you can't just say,
well, we made the language nicer,
but your code now runs 10% slower, right?
So it's something that people weren't willing to commit to
making integer overflow well-defined.
And so what we did instead is we defined the storage,
but not the arithmetic.
And the goal is instead to have library features
that allow you to express the type of behavior you want on overflow,
whether you just want to track where the overflow occurred
and do something later, whether you want it to trap
or whether you want it to wrap or something like that.
So I think as a library feature, we're going to try to do that.
And the numerics group has been working on that for a while.
So I presented the paper, and it went really well. Aaron Ballman presented it to the C committee; they want the same thing to be adopted for C. And then Jens Maurer took my wording, which was expressed in engineering terms, and transformed it into a more mathy thing. So he wrote a paper that was eventually adopted that made everything more mathy and, in my mind, less intelligible. But that's the way the standard is usually created. And so, you know, it should be equivalent in all cases, but the Core group thought it was really important to have a more mathy description rather than an engineering description.
So it is specified now that that part has been accepted,
it is specified that storage is two's complement for integers.
Right, yeah.
Okay.
And then now this gives the library,
core libraries, something to do
to give us more tools around that, basically.
Exactly, yeah. And that work
was ongoing before I presented my paper,
so it's not like I made it happen.
I think I just kind of precipitated things
and told people, like, okay, well, if the language is not going to do it,
the library really should be doing it.
So Numerics is looking at that pretty
intensely. I think they met in San Diego
at the same time
as my group was meeting, so I wasn't able to attend, but I think that's one of the
things they've been working on. Do you have any idea what that will ultimately
look like? Will we have new strongly typed signed integer
wrapper type things? Yeah, I would hope.
It's probably something that you can customize through templates or whatever, what type of behavior
you want, whether you want wrapping or trapping
or something like that.
All right.
But, I mean, I've worked on a handful of codebases
that have this type of templated integer
and floating point, you know, safe helper.
So, you know, Clang has some with its APInt class and APFloat class,
the arbitrary-precision ones; those are
used to implement things like constexpr and stuff like that.
And so those allocate when you overflow and things like that by just being effectively
a big int implementation.
But then Chrome and WebKit both have implementations of effectively the same thing, which allow
you to detect whether overflow occurs.
And so any untrusted input in browsers
just goes through that type of class
because you just do the arithmetic.
You don't trust yourself to do it right.
You don't trust the user to not be malicious.
And so it's kind of your security boundary.
Anything that comes from potentially malicious sources,
you feed it through that integer class
and you check whether something bad happened or
not. And so, and you know, Firefox is the same thing, I think it's called SafeInt that they use.
So there's a handful, and I think I mentioned them in my talk as well. But it's a very common
thing. It's just a question of what do we want the standard one to look like?
That's assuming the performance characteristics of it are good. Or, as you say, if there's a templated set of things that we can choose from. In an emulator I was working on recently, I had to deal with integer overflow and then set the emulated CPU flags appropriately. And the best way I could come up with to do that was to do the 32-bit math in a 64-bit value and then check to see if any of the high-order bits were set, basically.
And also, compilers have built-ins as well for that. If you look at __builtin_add_overflow and things like that, they can do that work for you as well.
Okay. The other way that people do it is you cast to unsigned, you do the arithmetic, and then you cast back to signed, right? Because a lot of the instructions, like add and multiply, are exactly the same.
So that's another approach to doing it.
Right.
So going back over some of the other features that made it through core and evolution, there
are a lot of changes made to constexpr.
Is that right?
Yeah.
Yeah.
So there's a big push to make everything constexpr.
And so Louis Dionne, who works at Apple and maintains libc++ for us, has been instrumental in that. And he's got a lot of support from people like Daveed Vandevoorde, who's at EDG. And they're just attacking every single constexpr thing, one at a time, to try to make it possible to have things like constexpr string and constexpr vector and other things like that. And so some of what they did was just, you know, allow unions to be constexpr, so that an implementation of string that uses the small string optimization can be constexpr, right? And part of the concern they have is, well, they can't break their API, their ABI, through C++20.
And so you want whatever is implemented right now to just be constexprable, right?
So union was one of the requirements here.
And so what that does, basically, is the compiler has to track what the active member of a union is through constexpr evaluation,
which initially was thought to be pretty hard, but people experimented a bit, and it turns out it's not that hard, especially for EDG, because EDG emulates a lot of other compilers, and so the worry was that for them it would be impossible. But they tried it out, and it's not, so they were really happy to go with that.
But there's other stuff, like constexpr try: you can't throw in a constexpr context, but you can have try and catch, and if you encounter them, it's still valid in a constexpr function.
Then constexpr dynamic cast, constexpr type ID,
constexpr pointer traits,
and there's a handful of other stuff.
Eventually we'll get constexpr allocations
in the standard as well.
So that will allow you to eventually have
just vector pushback and other stuff
all be constexpr.
Do you expect...
I'm sorry, go ahead.
Yeah, go for it.
Do you expect that the constexpr allocation will make it into 20,
or do you expect that's something we'll see later?
Oof, I'm not sure.
I think we...
There's someone in my team who started working on implementing
some of those aspects
for LLVM.
Especially for allocation,
I think we might need some
experience to make sure that what we're doing
makes sense.
You want the allocation to, whatever you've
created as a vector, then be usable at run time.
And so I'm not sure it'll make it to 20, because only papers that were presented in San Diego to the language group can potentially make it into 20.
And I think that one was discussed, but it's a bit tight.
So I guess we'll see.
We still have hope that it might, but it's really not clear, and I wouldn't put money on it.
Okay.
Another feature I wanted to talk about was the revised memory model.
What is changing exactly?
Right.
So SG1 is the group that does concurrency and parallelism, chaired by Olivier Giroux, who you've had
on semi-recently.
And part of what they do is they own like Atomics and Executors and a bunch of other stuff like that, like futures and async.
And part of the memory model for coroutines and stuff.
And what they've done over time is they've tried to have a formal proof that the memory model holds on existing architectures. So there's a guy who did a PhD under Peter Sewell called Mark Batty,
called Mathematizing C++ Atomics or something like that. I don't quite remember. But the idea was
that when you write atomics and you don't have a data race, then on every architecture that C++
targets, you can prove that your program behaves as if it were sequentially consistent.
So the model is called Sequentially Consistent for Data-Race-Free Programs
that was published in the early 90s, I think, by Sarita Adve.
And the idea is to use theorem proving to prove that that's the case.
It's pretty rare that you can prove anything
about programs, and it's kind of interesting that they've
done that for Atomics, and especially since
Atomics are very
unintuitive to use
in a lot of cases, it's nice that
if you stick to these rules,
and release, acquire, and sequence
consistent things, then your program will behave in this bounded set of ways on all the architectures that you target.
Now, since theorem provers are also themselves programs,
it turns out that there was a flaw in some of the theorems.
And so they found that flaw, and they've been working at kind of fixing things, right?
So the idea was, well, some really weak architectures like power and NVIDIA's GPUs,
weak in the sense of the memory model,
don't actually obey what C++ said should be obeyed.
And so we can either say that they're non-conformant
or we can fix the memory model to make them conformant.
And what you end up doing there is looking at the programs that are
non-conformant and saying, would anyone actually
write this code? Does anyone rely on this?
And it turned out that nobody would.
It's programs with four threads
doing weird things with
different accesses to
the same variables using acquire, release,
and relax, and sequential consistent.
Nobody does this. And so they found a way
to fix the memory model to
still be compliant on those architectures
without degrading performance.
They're really minor fixes.
It's really kind of a
niche topic, but it's really
important for hardware people to know
that people
writing C++ code will have
expected outputs when that code is executed
on their architectures.
And so for them, it was really important.
And they spent a bunch of time trying to fix this and finally came up with something that
everyone thought was palatable.
So I think it's pretty cool if you're into that type of topic.
For most people, you can just keep using atomics and it'll still do the same thing.
Right.
Right.
I thought we had covered basically everything.
Oh, nested inline namespaces are left, huh?
Yes, that's another thing that we discussed, and I think it moved forward for C++20,
and I'm not a fan of that one, I have to say.
Okay. So the idea is you can, when you have inline namespaces,
you want to be able to nest them and kind of say like,
oh, I'm in this namespace right now, but it's inline.
And so you can say, right now in C++17, I think: namespace foo::bar::baz, and then open a curly.
But you can't do that if it's inline, right?
Okay.
What this does is it allows you to put inline in there to say, oh, actually, it's inline.
First, the syntax, I think, looks kind of ugly, but it's not that bad, right?
But it's not really fixing what inline namespaces wanted to fix, is my opinion.
And the goal of inline namespaces was to be wordy, right?
So I kind of don't mind having to type more stuff when using an inline namespace, right?
Like, I agree it's convenient to type less.
But in this case, that was kind of the point.
So it's not a big deal.
At the end of the day, I just won't use that feature and
whatever. But some people really wanted it, and it got enough consensus in the language working
group that it moved forward. So that's kind of how the standard works, right? Some people think
some stuff is important. Some people think it's not as important. And you kind of want to allow
people to get the features that they want
outside of language.
If you don't like it, you don't have to use it in this case.
Right.
Well, is there anything that was
a highlight for you, JF, that we haven't
gone over yet?
Well, so one thing that I was really
enthused about is I wrote a paper about
deprecating volatile, and I
have been joking
about this for a while and finally pulled the trigger and wrote that paper. And well,
first I wrote the paper with the title deprecating volatile just to get people to click on it
because they're like, well, you what now?
Clickbait with the standards committee, in other words.
Exactly, exactly. And I think that the subtext should really be deprecating most of volatile
or deprecating the bad parts of volatile.
And so I think I made a pretty strong case,
first for why volatile is the way it is, right, in the paper,
and then for why I think we should deprecate parts of it in C++ and in C.
So I'm going to have to try to talk to the C committee about some of that volatile stuff.
Now, I think everyone agrees if you design a language
from scratch, you wouldn't design it with volatile
as it is in C++, right?
You'd have something like a volatile load and volatile
store, and volatile
non-tearing load and non-tearing store, and that
would be about it. You'd participate
in the memory model. So languages
like Rust and other languages
do stuff like that already.
So volatile has useful semantics.
I try to go through them in the paper.
I want to preserve those,
so the valid uses of volatile,
but I want to remove the ones
that are super error-prone
or that are really non-intuitive.
So for example,
if you have a volatile variable
like B and C
and you do A equals B equals C,
how many volatile loads and stores did you perform doing that?
Well, it depends whether you're in C or C++, and which version of the language you're using.
So stuff like that is just really unintuitive.
And so my paper goes through the stuff I think is important that we want to preserve, even
though it might be distasteful and we do it some other way.
Like, I don't want to break those uses, right?
What low-level code like, you know, kernels do, we want to preserve those.
And at the same time, what's unintuitive, what's usually broken when you write it,
what should you not do?
And I had a bit of fun writing the paper, too.
So I kind of quoted one of my favorite books from Patrick Rothfuss in the paper, kind of using some of his quotes.
I thought that was pretty fun. And he seemed to enjoy it on Twitter, too. That was pretty great.
So overall, the paper got really good reception from the committee. And I'm going to send an
update for the next meeting proposing actual wording changes, not just, hey, how about we
do these things? So I got really strong support from the committee for that.
That was my personal highlight, because I was pretty excited about it.
It's in my vein of trying to deprecate things out of C++
that we shouldn't have anyways.
So without breaking people's code,
I'd like to simplify the language
and bring it to look like
a more mature language.
A more well-thought-out language.
Yeah, it looks like a very mature language at the moment.
Right, right. It has all the warts.
I'd like it to be mature and pretty.
And then Bjarne was really clear.
If you've read The Design and Evolution of C++,
from early on, he put volatile in C++,
but he wasn't really happy about what was there.
People felt like they really needed it, and they did.
And so he wanted to have a solution for them
without breaking compatibility with C.
And that's really why it's there.
And then it kind of feature crept
into looking like what const looks like, right?
So into qualifiers and other stuff,
and you can qualify member functions with it,
and arguments and return values.
Like, what does it mean to have a volatile
argument? Well, for
function passing purposes, it doesn't mean anything.
The calling convention is exactly the same. It doesn't
show up in the ABI. It just means something
local to the function itself. And same thing
for a volatile return. It doesn't mean anything.
You can, in fact, have the
forward declaration not have the volatile and have
the actual function definition
have it, and it's totally valid C++. And so there's a lot of really weird stuff that volatile has
that's kind of accreted through what was perceived as consistency with const.
And I think that's kind of a misfeature that I'd like to get rid of.
And I think if we could take just a second to back up and make sure if I maybe I could try to summarize what you said.
You want to remove most of volatile, but leave some ability to do a volatile load in a volatile store.
And for our listeners who are not aware, if you can correct me if I'm wrong, please.
The only legitimate use of volatile is when you're talking directly to hardware.
So this volatile load and volatile store would be,
I need something at this address right here, please.
Well, yes, there are actually a few valid uses of volatile,
and I go through them in the paper.
So load and store are really the main thing.
It's not normative, but what the language says in notes
and in the design intent papers that accompany the language
is that if you wrote
volatile in the code, then the abstract machine should do a load or a store, right? Like that's
what should happen. And it's really hand wavy how it's specified, but we kind of all understand
what's going on, right? The idea is like you might have a register that has special meaning or a
memory location that does special things when you touch it, and you really want it to happen.
So that's the main usage that I've seen for volatile. But there's uses for something like
shared memory, where alighting a load or a store through shared memory could be a security issue.
And so I have a few examples of that in actual exploits in the paper where a hypervisor or
browser used shared memory but failed to use volatile,
and the compiler just removed the loader of the store.
And so that means that you have a time-to-check,
time-to-use bug in your code.
So you can cause a race to happen if you're on the other side of the shared memory
and you're a malicious program,
and you can cause a higher privilege process
to do something it didn't want
because it loads the size twice or something like that.
Then it's valid to use volatile
for setjmp and longjmp to preserve
values across the setjmp.
So when you longjump back, the value is still there.
And that's
important because setjmp
can return twice, which no other function
can. And so the
compiler
has to know that it has to preserve that
value across that twice-returning
function.
And then it's also valid to use for signal
handlers. So that's the other
use case for that.
Signal handlers kind of behave the same as
setjmp and longjmp in that case. It's like,
well, the value should be there.
Now, the thing to keep in mind is volatile can tear.
And volatile doesn't participate in the memory model.
It's not sequentially consistent or anything
unless you use volatile atomic.
And so it's really hard to use properly
if you're trying to do certain things, right?
And there was a feeling initially pre-C++11
that volatile could be used to make threads happen.
And in a way, it was, right?
So Microsoft's implementation of C++ on x86,
to this day, guarantees sequential consistency
for volatile loads and stores.
It might still tear on their implementation,
but it'll be sequentially consistent if it doesn't.
And if it tears, then it tears in a predictable order.
And so there's kind of a wide breadth of implementations
of volatile. And it's really,
especially when you're trying to deal with hardware, it's
really up to the
vendor of your compiler to tell you what
that specific thing means on
that specific hardware.
So yeah, it's a bit tricky
and I'd like to make that a bit less
tricky. Keep the good uses and just
get rid of the bad ones.
Okay.
All right.
Okay, well, it's been great having you on the show again, JF.
Where can people find you online?
Online?
Well, so I have my usual Twitter account
where I just tweet jokes about C++.
I don't really tweet anything serious.
People take me seriously, but I don't. tweet anything serious. People take me seriously, but I don't.
Please don't.
Otherwise, I don't know.
I participate in LLVM quite a bit.
And if they want to meet in person, I'll be at the LLVM meetup and other things like that.
So CppCon is another place where I hang out.
I try to not go to too many conferences because it's kind of hard with a kid to do that.
But I'll usually show up at CppCon and the standards meetings.
Between the standards meetings, the LLVM meetings,
there's not a lot of time left for conferences, I would imagine, besides...
Right, well, the LLVM meetings are more kind of LLVM socials in the Bay Area,
and so they're kind of once a month.
I usually go there, bring my kid.
Okay.
And it's not just LLVM people.
It's more like people interested in LLVM,
but most of the people who contribute to LLVM in the Bay Area usually show up,
like, semi-frequently.
So it's kind of neat to meet people you effectively work with
in the open source community and be able to move things forward
more easily than through email or IRC or something like that.
And because often our colleagues and other companies
also work on the Standards Committee,
we also end up kind of talking about Standards Committee-related stuff as well.
So it's kind of an interesting little group of people.
Cool.
Okay. Thanks again, JF.
Yeah, thanks for having me.
Yeah, thanks for coming.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic; we'd love to hear about that too.
You can email all your thoughts to feedback@cppcast.com.
We'd also appreciate it if you like CppCast on Facebook and follow CppCast on Twitter.
You can also follow me at @robwirving and Jason at @lefticus on Twitter.
and Jason at Lefticus on Twitter.
We'd also like to thank all our patrons who help support the show through Patreon.
If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast.
And of course, you can find all that info and the show notes on the podcast website at cppcast.com.
Theme music for this episode was provided by podcastthemes.com.