CppCast - Modules Present and Future
Episode Date: June 18, 2020

Rob and Jason are joined by Gabriel Dos Reis from Microsoft. They first discuss the recent articles about Microsoft switching from C++ to Rust and let Gaby set the record straight. Then Gaby talks about the final state of Modules, how Microsoft is using them internally, and Gaby's plans for the future of Modules and much more.

News
r/cpp comments on Web Assembly Use Cases
Microsoft: Rust is the Industry's 'Best Chance' at Safe Systems Programming
Why is std implementation so damn ugly
Italian C++ Conference 2020 Videos

Links
Practical C++20 Modules and the future of tooling around C++ Modules with Cameron DaCamara
Peeking Safely at a Table with Concepts with Gabriel Dos Reis
CppCon 2019: Gabriel Dos Reis "Programming with C++ Modules: Guide for the Working"
CPPP 2019 - C++ Modules: What You Should Know - Gabriel Dos Reis
RustSec Advisory Database

Sponsors
Clang Power Tools
Use code JetBrainsForCppCast during checkout at JetBrains.com for a 25% discount
Transcript
Episode 252 of CppCast with guest Gabby Dos Reis, recorded June 17th, 2020.
This episode of CppCast is sponsored by Clang Power Tools, the open-source Visual Studio extension on a mission to bring Clang LLVM magic to C++ developers on Windows.
Increase your productivity and automatically modernize your code now.
Get ahead of bugs by using LLVM static analyzers and CppCore guidelines checks from the comfort of your IDE.
Start today at clangpowertools.com.
And by JetBrains, the maker of smart IDEs and tools like IntelliJ, PyCharm, and ReSharper.
To help you become a C++ guru, they've got CLion, an intelligent IDE, and ReSharper C++, a smart extension for Visual Studio.
Exclusively for CppCast, JetBrains is offering a 25% discount on yearly individual licenses on both of these C++ tools, which applies to new purchases and [...]
In this episode, we discuss whether Microsoft is switching to Rust.
Then we talk to Gabby Dos Reis from Microsoft.
Gabby talks to us about the present and future of modules, contracts, and more.
Welcome to episode 252 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
I'm all right, Rob. How are you doing?
Doing okay. Doing okay. I don't think I have any news to share myself. You?
No, not at the moment. We're still mostly in a wait and see what happens with CppCon and all of the things kind of perspective for my part.
Yeah, I will say we're a little disappointed that our local government here is not treating the ongoing pandemic more seriously.
Our numbers are going up and they really should enforce a mask-wearing policy. My wife and I have been writing our governor and mayor this week.
We are, in Colorado, supposed to always wear our mask when we're out. And I would say Jen and I did some shopping yesterday, and it was almost 100% of people had a mask on.
Yeah, it's much better than what we're seeing here.
Although I'll say some of them were wearing it, you know, below their nose or below their chin, but so there's an attempt.
Yeah. Okay, well, at the top of every episode I'd like to read a piece of feedback. We got a lot of Reddit comments on last week's episode with Ben Smith on WebAssembly.
And this one comment on Reddit was just asking, what is the use case? And I guess we didn't talk
too much about use cases. We did talk a lot about like emulators and virtual machines as possible
use cases. But I think maybe that's not the most intended use case. And there is a link they can look at on WebAssembly's website
that shows a whole bunch of use cases.
And a couple others were listed in this Reddit thread,
so you can go take a look.
Sounds good.
Yeah.
But basically, you can think of old Flash video games.
That's a real good one where instead of using Flash
or other little embedded scripts, you can now use WebAssembly, things like that.
Mm-hmm.
Yep.
Okay, well, we'd love to hear your thoughts about the show.
You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com.
And don't forget to leave us a review on iTunes or subscribe on YouTube.
Joining us today is Gabby Dos Reis. Gabby is a principal software
engineer at Microsoft where he works in the area of large-scale software construction tools and
techniques. He's also a researcher and longtime member of the C++ community, author and co-author
of numerous extensions to support large-scale programming, compile time, and generic programming.
His research interests include programming tools for dependable software. Prior to joining
Microsoft, he was assistant professor at Texas A&M. Dr. Dos Reis was a recipient of the 2012 National Science Foundation
Career Award for his research in compilers for dependable computational mathematics and
educational activities. Gabby, welcome back to the show.
Oh, good morning. Good to be here.
I'm curious about this NSF Career Award. So that's just acknowledging what you've done through,
it's not one specific thing, but through the course of your whole career?
So it is for early-career faculty researchers,
like, let's say, in your first five to seven years of your career.
And they look at what you've done,
the kind of research you're conducting,
and you get to write a grant and explain what
you want to do, let's say, for the next 10 years.
Okay.
And they look at your track records, and, you know, it's a program that is one of those
high-risk, high-reward. So it is very competitive.
Okay, so it's been almost 10 years then since you received the award.
How are you looking?
Oh, so yeah.
After I left A&M, Texas A&M, I had to relinquish the award.
And it was good until I left academia.
But interestingly, all that I've been doing at Microsoft
is right on target.
That same year, I had another grant from NSF, too,
to work on modular generic programming concepts.
So we got concepts into C++20,
and we got modules into C++20.
So I think I can write back to NSF and say,
hey, I've done my part.
That's awesome.
Awesome.
Okay, well, Gabby, we got a couple news articles to discuss,
and then we'll start talking more about some of that work
you just mentioned, like modules and concepts and contracts.
Okay, yeah.
All right, so this first article, which we're definitely going to want to get your opinion on, Gabby, is "Microsoft: Rust Is the Industry's 'Best Chance' at Safe Systems Programming."
And this is on a website called The New Stack, which I'm not familiar with.
But yeah, the article is covering, I guess, the viewpoint
of this one Microsoft employee who
did a talk, Ryan Levick. He's a cloud developer advocate
and he talked about how Microsoft is starting to switch from Rust
or from C++ to Rust for some of its infrastructure software.
And I wanted to hear from you.
How much is actually changing in Microsoft with C++?
Is your job on the line?
Yeah.
So I think the first thing to say is that the title of that article,
that wasn't the only article that was written on a topic.
There were quite a few.
But that title was very sensational.
Like, you know, they need to have people click on this stuff.
But, you know, going to the core of the topic,
what I think I wanted to say is
Microsoft is a really big company and employs a lot
of smart people, a lot of them.
And the thing you have with a lot of smart people is you have a lot of diverse points
of view, ideas.
And right now, what is going on is that there is a group of people in MSRC.
So MSRC is the Microsoft Security Response Center.
And also in the Windows division.
Principally in the Windows division.
Looking at it. So this is purely a learning phase.
Like today, as I'm talking to you,
there is no product shipping of Windows
or Microsoft product shipping written in Rust.
The division I'm in, which is the developer division,
the division in charge of producing developer tools
and supporting for Microsoft,
is very much committed
to C++. It is a core programming language that Microsoft supports and that Microsoft promotes. So no, we are not telling people go and switch from C++ to...
No, no. That article was very much sensational.
And I think I saw some of the discussions on Reddit
where someone says, oh, he was actually misquoted.
Ryan was misquoted in terms of what he was saying.
So he was saying something,
and then some journalist or whatever extrapolated.
So, yeah, there's a lot of stuff in there.
So I haven't received any signal that I should be updating my CV and password.
And I will tell my friends using C++ in general,
not just Microsoft Compiler,
but C++ in general,
then no, you're fine.
You're good.
That being said,
I think there is a core to the problem that I think far too often
we don't take sufficiently seriously,
which is that, yes,
there are problems with memory safety,
type safety, and there are some studies in the Chromium code base where they say they
have two-thirds of their security bugs related to components written in C and C++.
I think they have been also popularized. And that matches some of the experience
that we've seen internally at Microsoft.
So there is a real problem
that some components written in C++
are subject to vulnerability.
And the real question we should ask is why.
Now, it is easy to go
from "oh, it is written in C++, therefore C++ is bad" to "why is it
actually happening?" And there are a lot of, you know, factors, contributions
there that we need to look at very calmly, level-headed, and
scientifically.
Like, we are engineers.
We are supposed to be using science first before we jump onto emotion or, you know,
some other hyperbole stuff.
So, you know, the first thing is, why do we have this? You can say, well, the language doesn't provide enough, you know, adequate functionality,
features to support, you know, people writing good code, right? And, oh yeah, look at Rust. Rust has
a borrow checker, and then you don't get these things. I think when people look at the reality,
it's much messier. You know, you can go on Rust's website, I forgot the name, but I'll find the link and send
it to you. And you can look at what are the vulnerabilities associated with components
written in Rust. And you'll see the usual suspects: use after free, type confusion, that kind of stuff.
Basically the same problems that we have in C++. So the real question is, what can we do in C++
to keep those issues at bay?
Bjarne and I have been looking at some ways to get there.
So essentially, a good chunk of my time in coming months
would be specifically to work in there.
And to be honest,
a lot of people have looked at this and done a lot of work.
Herb has published a series of papers
and I think he even had a big talk
on checking for memory safety and that kind of stuff.
So a lot of people have been looking at this,
and I think it is time now for us,
in particular Microsoft,
who produces a lot of C++ developer tools,
to put more muscle behind this issue
without necessarily telling people,
go and rewrite your code in Rust.
It's a fine language.
I actually think here's something that I like.
So C++ has this notion of resource acquisition is initialization, RAII, right?
And the Rust guys took it and ran with it and are uncompromising about it, right?
I think when I was in academia, we were talking about academia earlier.
When I was in academia, one of the things I used to like to ask is how far can we go
if we just stuck to this principle, right?
I think one of the contributions of Rust is showing us, hey, this is how far you can go,
at least so far.
This is how much you can do, right?
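As an illustration of the RAII principle mentioned here, a minimal C++ sketch might look like the following. The names `ScopedResource`, `live_resources`, and `live_while_in_scope` are hypothetical and exist only for this example; the counter stands in for a real resource such as a file handle or a lock.

```cpp
#include <utility>

// Hypothetical stand-in for a real resource (file handle, lock, socket).
// It only counts how many "resources" are currently held.
int live_resources = 0;

class ScopedResource {
public:
    // Acquire the resource in the constructor.
    ScopedResource() { ++live_resources; }

    // Release it in the destructor, but only if this object still owns it.
    ~ScopedResource() {
        if (owns_) {
            --live_resources;
        }
    }

    // Moving transfers ownership; the moved-from object releases nothing.
    ScopedResource(ScopedResource&& other) noexcept
        : owns_(std::exchange(other.owns_, false)) {}

    // Copying is disabled so two objects never release the same resource.
    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;
    ScopedResource& operator=(ScopedResource&&) = delete;

private:
    bool owns_ = true;
};

// Returns how many resources are live while two of them are in scope.
// All destructors run automatically when the function returns.
int live_while_in_scope() {
    ScopedResource outer;
    ScopedResource inner;
    ScopedResource moved = std::move(inner);  // ownership moves, count stays at 2
    return live_resources;
}
```

Calling `live_while_in_scope()` reports two live resources, and once it returns the counter is back at zero without any explicit cleanup call.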
And then you have to take into account some other factors.
How do you design a programming language
that is used by millions of developers?
Now, the numbers I've seen through public surveys
seems to indicate that we have more than 5 million C++ developers.
5 million.
So how do you design a programming language that can be used by 5 million people?
And if we want to go and dump very advanced type theory
into the hands of these people, it's going to be hard to scale, right?
If we require PhDs, we fail, right?
Because most PhDs,
they don't actually want to write programs.
They want to be managers or something, highly leveled and stuff, right?
And, you know, I like the idea.
My son, you know, started now using C++.
He's getting to high school.
I like the idea that high school, you know, people can use C++ and write
programs. I don't want them to have to know advanced stuff before they can
write reasonably good programs, right? So we have to do something in this space of
making C++ simpler to use, and by default, when you use it,
you write code that is dependable.
So what do I mean by dependable?
So dependable doesn't mean that
it is always bug-free or faultless.
It means that even if it has bugs,
it has enough mitigation there.
It doesn't take down the entire server,
applications, and so forth.
It may have bugs, but
on average,
you get everything done.
These days, when you have cloud services,
they don't guarantee
100% availability.
They give you the three nines.
That's dependability.
Of course, you can't put anything in a language
that would prevent a human logic error.
And some of these security bugs we get,
they are not all because of the language,
you know, lack of functionalities.
Some of them are just, you know, logic errors.
So, yeah, I guess going back to the question you asked
about Microsoft and Rust, yes.
So I think Microsoft will always be investigating
new programming languages.
Like, you know, it is a software company.
That's based on technology.
You can't just say, oh, we have enough programming languages.
We are not going to look at it.
That's not going to happen.
So it will always be investigating
new programming languages to look at
what can we do to solve
these problems that we're having.
And I think Rust is
one such language, but it is not the only one.
I think there was another article in the news
that says, oh, Verona is also being looked at.
Verona is a research language at this time that's being developed by Microsoft Research,
principally on the campus in Cambridge.
It's another take on this idea of resource management. So Rust does everything statically.
That is the aspiration, right?
So there are, you know, we can actually talk to the guy, Matteo.
It tries to look at, well, you can't do everything statically.
Like in Rust, if you want to write a cyclic data structure like a graph,
you know, it requires some kung fu style, like black belt kung fu, to get there.
But if you want a language that is used by at least five million people,
they should be able to easily write cyclic data structures.
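To make the point about cyclic data structures concrete, here is a minimal C++ sketch of a two-node cycle. It uses `std::shared_ptr` for the owning edge and `std::weak_ptr` for the back edge so the cycle does not leak. The `Node` type and `walk_cycle` helper are hypothetical names for this example and are not taken from Rust, Verona, or any Microsoft code.

```cpp
#include <memory>
#include <string>

// Hypothetical graph node: owns its successor, observes its predecessor.
struct Node {
    std::string name;
    std::shared_ptr<Node> next;  // owning edge
    std::weak_ptr<Node> prev;    // non-owning back edge, breaks the ownership cycle
};

// Builds a <-> b, then walks a -> b -> back to a and returns the name reached.
std::string walk_cycle() {
    auto a = std::make_shared<Node>();
    a->name = "a";
    auto b = std::make_shared<Node>();
    b->name = "b";

    a->next = b;  // a owns b
    b->prev = a;  // b refers back to a without owning it

    // lock() yields a usable shared_ptr only if the target is still alive.
    if (auto back = a->next->prev.lock()) {
        return back->name;
    }
    return "expired";
}
```

`walk_cycle()` follows the owning edge from a to b and the back edge to a again. The liveness check happens at run time through reference counting, which is one way (not necessarily Verona's) of moving a check out of the static type system.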
So a very nice idea is some of these checks, we can actually
move them to runtime. So they design a framework, a runtime system, that will support this notion of
linear logic and affine logic. So they are looking at it, and we'll see how far it can go. And then you get to
the point where you have to do engineering, which is: you have the math, the logic, solid, then you have
real life. You have to, you know, bring things together to make it scale.
Right. Yeah. Okay. Well,
we've got two other quick articles.
This next one is just a discussion on Reddit, and it's "Why is standard implementation so ugly?"
And yeah, there's some interesting discussion going on here,
but the short answer is, you know, they have to use, like, the leading
double underscore because it's reserved for the standard library.
Is that right, Gabby?
Yes, that is the primary reason.
And the question, why?
You always have the why of the why.
And the why of the why is, well, when we started,
we didn't actually have good namespace support, protection.
We have header files technology and we have macros.
And as soon as you have that, you've got to protect yourself.
And then the standard carves out this, you know, what
we call a namespace: you have
certain identifiers where you say, oh, users
can't touch them.
So this is a gentleman's contract.
People can always go and break that contract and they get what they deserve.
And in return, the implementations,
standard library implementation,
get to use those identifiers to do whatever they have to do.
And the code is really, really ugly.
I don't write a lot of standard library stuff these days.
I used to do that when I was working on GCC,
and it was just ugly, like underscore, capital something.
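The interaction between macros and header files described above can be sketched in a few lines of C++. This is an illustration under stated assumptions, not code from any real standard library: `user_buffer_size` and `lib_sum` are hypothetical names, and the `lib_` prefix merely plays the role that reserved spellings such as `__len` or `_Len` play in real implementations.

```cpp
// A user may define a macro with an ordinary name before any header is
// textually included. From here on, every token `len` is rewritten to 16.
#define len 16

// The user's own code relies on the macro and works as intended.
constexpr int user_buffer_size() { return len; }

// Hypothetical "library header" code. With the macro above in effect, a
// parameter spelled `len` would not compile:
//
//     int sum(const int* data, int len);   // expands to `int 16`
//
// Standard library implementations avoid this by using reserved spellings
// such as `__len` or `_Len`, which user code must not define as macros.
// Ordinary code may not declare reserved names, so this sketch uses a
// `lib_` prefix to play the same role.
int lib_sum(const int* lib_data, int lib_len) {
    int total = 0;
    for (int i = 0; i < lib_len; ++i) {
        total += lib_data[i];
    }
    return total;
}

#undef len
```

Because macros are not exported from a named module, code organized that way has less need for this defensive naming, which is the hope expressed in the conversation.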
Well, hopefully, as we're moving forward,
modules will help.
But as long as we have to support header files and macros,
you get that thing.
But there exists a future, maybe,
where they don't have to do that anymore
with the modules?
Yeah.
If you start a new project,
and you're very systematic about usage of macros, like, no,
you're not going to use macros, you're going to use something else.
There are some good macros, like the ones where
you have some kind of local #ifdefs.
Those are okay. But for the vast
majority, you don't need macros.
And if you're very systematic about that,
then you don't need the ugly stuff. You can
just write really beautiful code.
Like, really beautiful, orgasmic code you can show people.
Okay.
And the last thing is just a note
that we've been talking a lot about conferences being moved online.
And the Italian C++ conference was just a few days ago.
And their videos are all available on YouTube
for anyone who wants to watch it
that didn't actually watch any of this live.
So it's, I think, worth pointing out for those of our listeners who listened to the episode with Nico a couple weeks ago that he was one of the keynotes here.
And he is diving into some of his hidden features and traps of C++ move semantics.
So it's kind of dovetailed off of our conversation
and the upcoming book that he has.
Yeah. All right.
So, Gabby, last time we had you on Modules was still,
I mean, that was, what, four or five years ago we probably had you on first?
Yeah, it was at the beginning, yeah.
Right. So Modules was definitely not done yet,
but is it now set in stone for C++20?
Is there any errata still being addressed?
Well, so, yeah, the first thing is C++ for, no,
module for C++20, done.
Okay.
So, like the rest of the language itself,
you always have some, you know, wording to look at and so forth.
But in terms of functionality and usage, now we have those things.
I expect that in the coming months and years, we will be looking more at how we leverage modules.
Like what tools do we build on top of the infrastructure that we have for modules?
What are the capabilities?
Like, in Cologne, for example, so that was almost a year ago, at the C++ meeting in Cologne
in July 2019, Richard Smith at Google came up with a proposal
that is leveraging modules for doing
packaging in a way that is very principled,
and I really like that proposal.
So we haven't gotten back to it because
we have been racing since then to
get everything lined up.
You know, we released the draft in Cologne, then we got national
body comments.
So in Belfast, we were all busy addressing national body comments, and we did the same
in Prague, where we signed off on C++20.
So we haven't gotten time to, you know, go back to, hey, we have these things, what can
we build on top of it to help the infrastructure community?
And we were looking forward to doing that in Varna.
I think that was at the beginning of the month.
Yeah.
Right.
Got canceled.
Yeah.
COVID happened.
But yeah, so in the coming months and years, I do expect us to focus more on
leveraging the basic infrastructure capabilities that we have now. We can add some other stuff,
but not change the core of what we have, and probably fix some semicolons or
this kind of stuff we're doing in the standards committee.
On the Microsoft side, we have been making, you know, very good progress.
We were talking about Rust and Microsoft, you know, earlier.
I can tell you that Microsoft has been putting a lot of resources just to make sure that we're delivering modules implementation
as fast as we can.
So we're doing this in the MSVC compiler,
which you all know has been around for a very long time.
And it is only very recently that we started having parse trees
to do compiler work.
We used to be using tokens.
So when I started the implementation
back, you know, the one that I presented
in 2015,
I had to introduce some of the
data structures, but what the compiler was using
was tokens.
So, like,
it's like, I don't know,
drinking coffee with a fork.
It can be done, but, you know,
there are better tools to get in there.
So we're getting the better tools into the compiler
and Microsoft has been putting the resources there.
Internally, the compiler itself is using modules,
at least in the modules component, using header units and modules. And I will be also shifting
some of my attention to tooling around modules, and, you know, some of this
will be open source, right? And the thing that I want to show the community here
is how your code looks, how it is organized, when you use modules.
So you'll be seeing modular code from Microsoft,
you know, in the near future.
That sounds very exciting.
Yeah.
So if I can say a little bit more on the modules topic.
So I think every compiler that I know of so far
that has been implementing modules
has to process the module's interface
and then save that in some form
so that it can reuse it without having to parse
and instantiate, that kind of stuff, all the time.
And so every compiler has to do that.
And so far, each compiler has been doing its own thing, right?
It'll be actually interesting if, in the
developer tools community,
we can come to an agreement on, oh, can we have a similar format?
Or at least can we offer the same API to look at these artifacts
that you produce from compiling modules? And if we can do that, so I'm not saying we are going to do
Java class files, but this is the kind of thing that now you can think of as possible. But if you
can do that, just imagine all the other stuff that you can build on top of it. I think the
tools ecosystem in Java benefits a lot from having one single format for,
hey, if you compile your Java code, this is what it looks like. If you want to bind C++ to any other language,
today you need something close to a C++ compiler
or you restrict yourself to the poor, impoverished subset of C++
that is easy to parse and handle at runtime.
So you're not actually using the richer and
safer part of it, right?
C++ brings you tools to
write safe and secure code.
We were talking about safety earlier.
But today,
we don't actually
use those at the interface
level because, you know, most C++
programs, or library components,
have to interface with other components written in other languages. This is what the language is
good at, but we are not using the safest part of the language to do that because we don't have
the tool support for doing that. But imagine that you can describe the interface of your component as a module
interface, and it gets compiled.
Then you have this data structure
that you can manipulate
without having to require
a C++ frontend.
You can go very, very far. You can do fantastic
stuff. So that's what
I'm thinking about, and
I would like to have the
tools ecosystem community, you
know, coalesce around and push forward. So I am writing a spec for what the
Microsoft compiler is doing today. I expect to be able to release that to the public long before the end of the year,
but we're already in June, so I'm just putting pressure on myself.
But I expect to release that very soon and get feedback from the community.
Are you guys interested?
What can you do with this?
If you're interested, are you interested in having other compilers get together?
You don't have to take this, but this is what is possible.
Can we improve it?
Do we want to go in a different direction?
What can we do?
I think that's what I want to do, where I want to go.
And then we can look at what tools we can build to, let's say, bring a scalable version of a borrow checker, right?
When people talk about Rust, they say, oh, yeah, I have the borrow checker.
You borrow this reference to that data, and you're the only one doing it,
and it's not going away before you come back and that kind of stuff.
You can do something very similar in C++.
It's a little bit more complicated, but you can do something very similar,
very nicely when you have a good data structure that represents your program.
I know when C++ modules were first voted into C++20, like a year ago, we talked a little bit at the time about how this separate paper was being written about module implementations.
Is some of the work you're talking about related to that?
Yes.
So the IPR representation, also, this is part of work that I've done with Bjarne
15 years ago. It's on GitHub.
I'm now taking some time to revamp it and
update it and
convince my fellow co-workers at Microsoft to use it more.
And for example, the code analysis team is seriously looking into it,
like using it in future release product.
Just to try to show the community,
this is what we can do to bring greater safety and security to our community, to our users.
Right.
And I'm saying this to emphasize the fact that
Microsoft is not disengaging from the C++ community.
If anything else, it's like it's pouring more resources into it.
Right.
I just want to debunk that notion.
So related to what you've been talking about,
well, a bit ago you mentioned that when you were first trying to do modules
with the Visual Studio compiler that you had a bunch of tokens and you didn't have an AST to deal with.
And longtime listeners of our podcast, we've brought this up several times.
So it's a little off topic, but I am curious, is that work done now?
Is the Visual Studio front end now fully modern?
Well, so that work is not finished yet.
But.
Okay.
So the but is not, oh, we'll do it.
No, no, no.
It's ongoing.
And there are several components to this.
So when we talked about it, the Microsoft team was using parse trees.
So it was not using an AST.
It was using parse trees. So there's a distinction between parse trees
and abstract syntax trees.
So an abstract syntax tree is like, it's abstract.
So a bunch of stuff has been thrown away, right,
from the concrete syntax.
And a parse tree is really a reification
of the grammar production.
So when a parser, you know,
parses your program,
the productions that it uses to parse it,
then it uses the structure to represent all of those things.
Okay.
You know you can declare a variable at least in two or three or four ways in C++,
like int x = 0.
So int x = 0 with parse trees
will be represented very differently
from int x{0}.
Right.
Right.
Right?
Yeah.
So you have different representation parse trees there.
However, if you look at AST,
essentially you can have one.
You have a variable declaration.
Its name is x, its type is int, its initializer is 0.
You're done.
Right.
Right.
So, abstract syntax trees have abstracted a lot from the syntax.
Whereas parse trees faithfully represent what is written exactly in the source code.
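The distinction drawn here between parse trees and abstract syntax trees can be sketched with a toy model in C++. Everything below is hypothetical and simplified: `CopyInitDecl`, `BraceInitDecl`, `VarDecl`, and `lower` are names made up for this example and do not reflect MSVC internals. A real front end would also record the initialization form, since brace initialization forbids narrowing conversions.

```cpp
#include <string>
#include <variant>

// Parse-tree nodes mirror the grammar production that matched the source text.
struct CopyInitDecl  { std::string type, name, init; };  // int x = 0;
struct BraceInitDecl { std::string type, name, init; };  // int x{0};
using ParseNode = std::variant<CopyInitDecl, BraceInitDecl>;

// An AST node keeps only the meaning: a variable, its type, its initializer.
struct VarDecl {
    std::string type, name, init;

    bool operator==(const VarDecl& other) const {
        return type == other.type && name == other.name && init == other.init;
    }
};

// Lowering discards the syntactic form, so both spellings yield one VarDecl.
VarDecl lower(const ParseNode& node) {
    return std::visit(
        [](const auto& decl) { return VarDecl{decl.type, decl.name, decl.init}; },
        node);
}
```

Two `ParseNode` values built from the two spellings hold different alternatives of the variant, yet `lower` maps both to equal `VarDecl` values, which is the sense in which the AST has "abstracted a lot from the syntax."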
So what the MSVC team has done is to introduce parse trees.
So you can think of it as phase one of modernizing the compiler.
Okay.
And what we're doing right now,
and Gor, I think you've had Gor on your show too.
It's been a little while, yeah.
So what Gor is doing now
is to move from the parse trees to the IPR,
the internal program representation,
which is, it's more than an AST.
So we are jumping the AST thing.
Like, instead of going to AST,
we're just jumping AST to go to what I call
the abstract syntax graph.
When you have a C++ program,
you don't get a tree, you get a graph.
Just imagine you have a class,
class A, and you have a comparison operator. It takes two A's and
returns a bool, or it is a member function operator== that takes an A. So if you want to represent that class
A, you have the class A, and then inside that class you have a member function operator==
that takes an argument of type A. So you have a reference back to the class that you're defining, right? So you have
cyclic structures in terms of semantics. Okay. So you actually have a graph, you don't
have a tree. So that's what the team is doing now. It's going from the parse trees, the concrete
representation of the source code, directly to the semantic graph, which is actually
what you want when you're doing semantics-based analysis.
Okay.
Whether you're doing code gen or you want to enforce memory safety, type safety, type
checking, you're actually using the graph.
You're not using the tree.
You're actually using the graph.
You want to be able to say, oh, it is the same class, the same class.
Sameness is somewhere you cycle back to where you started at some point, right?
It's not a tree.
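The class A example above, where a member function's parameter type refers back to the enclosing class, can be modeled in a few lines of C++. This is a toy sketch with hypothetical names (`ClassDecl`, `FunctionDecl`, `add_equality`, `refers_to_itself`); it is not the IPR or the IFC format, only an illustration of why the structure is a graph.

```cpp
#include <string>
#include <vector>

struct ClassDecl;  // forward declaration: functions refer back to classes

// A member function whose parameter type is an edge to a class declaration.
struct FunctionDecl {
    std::string name;
    const ClassDecl* param_type;  // points back into the graph
};

struct ClassDecl {
    std::string name;
    std::vector<FunctionDecl> members;
};

// Models: struct A { bool operator==(A) const; };
// The new member's parameter type is the very class that contains it.
void add_equality(ClassDecl& cls) {
    cls.members.push_back(FunctionDecl{"operator==", &cls});
}

// True when some member leads back to the enclosing class, meaning the
// structure contains a cycle and is therefore a graph rather than a tree.
bool refers_to_itself(const ClassDecl& cls) {
    for (const FunctionDecl& fn : cls.members) {
        if (fn.param_type == &cls) {
            return true;
        }
    }
    return false;
}
```

The pointer comparison in `refers_to_itself` is the "sameness" mentioned in the conversation: two paths through the structure arrive at the same node, not at two equal copies of it.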
That sounds like it.
So that's just what Gor is doing right now,
and the code analysis team is leveraging his work.
And that abstract semantic graph is the data structure that we save in the
compiled module artifact, what I call IFC.
Okay.
So the documentation that I'm writing is to document.
This is the abstract semantics graph that you get from compiling a C++ program.
And then you can go and walk that and do wonderful stuff.
I don't know whether you were able to watch the talk
that Cameron gave on module tooling.
There was the virtual C++ conference a month ago organized by Sy.
And Cameron, Cameron DaCamara,
gave a talk on modules tooling.
And what he presented,
the tools that he presented,
was based on the IFC.
And it gives you,
it's an illustration of the tiny tip of the iceberg
of tooling in the 21st century for C++.
We should try to put a link to that in the show notes.
Yeah.
I watched some of the virtual C++ talks.
I don't think I watched that one, but I will find a link for it.
I will send you a link to that.
And it's a wonderful thing.
And it's like, you know, 30 or 40 minutes talk, but it's really great.
Awesome.
It sounds like a lot of work is going on and that it's going to enable
more
code analysis, more warnings,
more static analysis and all that.
Absolutely, yes. And so, yeah, the code analysis team
is also in and about analyzing
its infrastructure.
I think some of the
things that we'd like is
the ability to
use the Microsoft compiler as just, you know,
not a plugin, but, you know, a tooling infrastructure.
Like, you know, I like to talk about this idea of MSVC SDK.
So, you know, that's the North Star that the team is working around.
And I'm very happy for that work that we're doing.
So the compiler would be just one more tool
that happens to use this graph output.
Exactly.
Fascinating.
Okay.
And you can also imagine,
so this abstract semantics graph,
I invented for handling module interface, but it has
to represent entire C++.
Now, you can just imagine you compile your code directly from C++ to that semantics graph,
and then you delay code generation.
You don't attain platform independence there because C++ is C++. No size of double is whatever the platform target says.
And then as long as you do a lot of SFINAE stuff based
on size of, a bunch of other things,
some decisions are going to be made and baked
into that representation.
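A minimal sketch of why such decisions end up baked in. This is not code from the episode or from MSVC; the names `inline_slot`, `boxed_slot`, `fits_in_pointer`, and `slot_t` are hypothetical. The choice between the two tags depends on `sizeof`, so it is settled when the template is instantiated, before any code generation happens:

```cpp
#include <type_traits>

struct inline_slot {};  // hypothetical tag: value is stored in place
struct boxed_slot {};   // hypothetical tag: value is stored behind a pointer

// True when a T fits in the space of one pointer on the target platform.
template <typename T>
inline constexpr bool fits_in_pointer = sizeof(T) <= sizeof(void*);

// The alternative is selected when the template is instantiated, so the
// answer is already fixed in whatever representation the front end emits.
template <typename T>
using slot_t = std::conditional_t<fits_in_pointer<T>, inline_slot, boxed_slot>;

// Portable: sizeof(char) is 1 by definition, so char always fits.
static_assert(std::is_same_v<slot_t<char>, inline_slot>);

// Not portable: this alias may name either tag depending on the target.
using long_double_slot = slot_t<long double>;
```

An intermediate form that already contains `long_double_slot` has committed to one target's answer, which is the sense in which full platform independence is out of reach.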
But you get very, very close to that.
And so this idea is not new. You know,
some of the implementation techniques that Clang is using are based on this, right?
And some of the discussion we had during the module standardization was that Apple didn't want,
sorry, Apple wanted the ability to compile C++
to an intermediate representation
that is not code gen.
And then later on, generate code from there.
So we needed to amend or modify some 30-year-old rules
in the standard to make that possible.
And they have to do with things that we used to do
because we didn't have modules
like static inlines and
other stuff.
And none of this work stopped last
week when that article came out.
Not
to my knowledge.
Just making sure.
Yeah.
I think my management chain
hasn't given me any indication that we should all stop and jump on Rust.
Okay.
Just making sure.
Yeah.
Like I said, if anything, Microsoft really, really is doubling down on C++.
It's the core language that the company supports and promotes.
And that promotion is not stopping
now, at
least not until
some huge event
happens.
I want to interrupt the discussion for just a moment
to bring you a word from our sponsors.
Clang Power Tools is the open source Visual
Studio extension on a mission to bring LLVM
tools like Clang++, Clang-Tidy,
and Clang-Format
to C++ developers on Windows.
Increase your productivity and modernize your C++ code with automatic code transformations
powered by ClangTidy.
Find subtle latent bugs by using LLVM Static Analyzer and C++ Core Guidelines checks.
Don't fiddle with command line switches or learn hundreds of compiler flags.
Clang Power Tools comes with a powerful user interface to help you manage the complexity
directly from the comfort of your IDE. Never miss a potential issue by integrating Clang
Power Tools into your workflows with a configurable PowerShell script for CI/CD automation.
Start experimenting on your code today. Check it out at clangpowertools.com.
Okay, so we've talked a lot about modules, but you said when we were preparing for the
interview that you've also been starting to focus a lot on concepts.
So what's your work been on concepts?
Oh, okay.
So concepts have also been implemented in MSVC by Xiang Fan, another principal on the team.
And so now I can write code with concepts.
And I can tell you that the front-end team
is dogfooding concepts in the compiler.
So the MSVC compiler is full of concepts.
So I started using these, you know, concepts,
as they were coming online, in the modules component.
And then Cameron said,
hey, we should be using this more often.
And then he just, you know, ran away with it. And I don't know what happened. In the last couple
of weeks I've seen a lot of bug fixes coming from John Caves, and I'm like, hey, these are all
concept related, and it sounds like we haven't caught them in the front end.
And, oh, yeah, it's because the back end is going full on concepts.
So the MSVC compiler itself is using a lot more concepts.
And the thing that we're realizing is that when you use them instead of SFINAE,
the code is just simple, wonderful, right? You know, I think some
of the patches that I've seen that I love are the ones with a lot of deletion, where a lot of
things have been removed and replaced with just a couple of lines. It makes the
code much simpler, more robust. And, you know, so that's what has been going on.
And as I've been doing that and helping the rest of the team do that,
I've been asking, like, what lessons can I take from this
and then tell the rest of the community?
So, you know, I gave a one-shot talk
at the virtual C++ conference
that we were talking about earlier.
And, you know, what I showed there
was based on my experience trying to clean up
stuff that used to be there
in the MSVC compiler.
And so it is not a technique that is unique
to the compiler.
It is broadly applicable.
And I will send you a link so you can watch it.
It's really useful.
So I guess I will have to talk to the C++ Core Guidelines
editors to update the guidelines to include concepts
to show how you can write wonderful, robust, safe C++ with concepts.
And this talk I gave actually specifically shows
how you move runtime checks to compile time assertions.
That guarantees statically something that you really wanted,
so you have more performant code at runtime. So that's what it's all about.
Okay.
Yeah, you'd love that.
Yeah, of course.
Yeah, and if I hear less code, I think fewer opportunities for bugs also.
Oh, yes, absolutely.
Yes, yes.
The bugs have to hide somewhere, right? So with less code, you give bugs less opportunity to hide.
Yes, absolutely.
Now, often when we hear a concept, well, sorry for the overload of the term here,
but when someone says something like SFINAE on the podcast,
I would like to say, hey, could you explain that quickly for our listeners?
But I don't know if it's possible to quickly explain SFINAE for our listeners.
But if you think that you could give them an illustration
that would work over the air here for how or why concepts is able to remove a bunch of code,
that would be awesome.
Okay, so I'm not going,
even though some of this code is SFINAE,
I'm not going to explain SFINAE,
because I don't actually want people to know about SFINAE.
Okay.
If you know about SFINAE, you have already gone too far.
Please come back.
Right.
So the basic notion about a concept is that
it is just a predicate over things that you manipulate at compile time. So those
things could be a type, could be an integer value, or it could be the address
of a function, a pointer to a function. Those things exist at
compile time, right? And since we have had constexpr, now we can have more values, not just
integers, but structured values, you know, values of class types at compile time,
right? So in the code that I showed in my talk, what happens is that you have a table. Usually you have this situation where you have
a table. You want to map, let's say, enums to some actions or to some names and that kind of stuff,
right? And you want to ensure that you have the same number of
entries in the table as you have in the enumeration. And then you go by hand, do this kind of mapping,
and then you write some assertion
to make sure that when someone is trying to index into this table,
the index is not out of bounds, right?
You want memory safety, right?
So when you look at it, it's like, hey,
it looks like there's some kind of one-to-one,
abstract one-to-one correspondence
between the valid indices into this table and the enumeration values. And
that is a static structure, so you can actually express that at compile time with
constexpr functions, right? Now, since it's a compile-time function, you can turn it into a
predicate. Remember, a concept is just a predicate. So you express that predicate as a concept. Now
you write your generic function that takes a type T that satisfies that concept, you know, possibly
the table that you want. Well, remember, a function template
is instantiated only when it is used.
And with a concept, it has to be used
and the predicate has to be valid,
which means that whatever invariant
you wanted to capture, you capture it,
and that invariant is satisfied.
Then you can get into the body of the function
and execute your code. So you have
moved the runtime complexity, you know, going through the table and making sure that everything is okay,
to compile time. The compiler has access to that invariant now. Now it can just go
and directly index into the table. So you have less code to run at runtime,
and most of the checks have been done at compile time.
Because the runtime code is small,
it actually means that you wrote less, not more.
Right.
Right.
And also the expression of the concept is very, very simple.
So an example is in the video,
and I don't know how to talk about it here,
but the bottom line is that you have
some static invariants that you can capture as concepts, because they are predicates over
compile-time values.
I think of types as values.
And then you parameterize your function based on that predicate.
And boom, you just now focus on the real work to get done.
There's no possibility of missing a check if you capture your invariants properly.
Definitely have to check that out.
Yeah, and I was actually preparing a talk for CppCon.
If we get it going, and if the talk is accepted,
where I wanted to show some techniques that I used
based on concepts to describe this abstract semantics graph
that we were talking about earlier.
Okay.
Yeah, so again, what I did there was just to capture
the algebraic structure, the static structure,
and turn it into concept.
And then the rest of the code just flowed.
And Anne had to describe this to Gianni.
I think his answer was, you're having a good day.
So I think it must be good.
Nice. That's awesome.
I'm actually, if you don't mind, interested in going
back to modules just for a minute, because we've seen a lot of articles and discussion about modules,
and it sounds like you at Microsoft are actually putting them to use. So I'm curious if you've seen any misconceptions about modules that you
would like to dispel while you have the opportunity, or any best practices about using modules that
you would like to share.
Yeah, so, you know, that's a very good question and right on topic.
So I've been thinking a lot about programming with modules.
My CppCon talk in 2019 in Colorado was actually specifically on that topic.
Like, if you're going to use modules, these are the best practices.
And before that, I gave a talk also on good practice with modules in Paris in June 2019.
And I'm also working on, you know, ongoing books now to try to present.
Yeah, you know, I only got 24 hours in a day.
I heard everybody got the same.
So it is hard. But, you know, a language feature is of no use if we don't know how to use it properly.
Right.
So I have to do that kind of work.
Anybody who wants to help me is most welcome.
Yes, I've seen a lot of articles on the web, and the ones that, I wouldn't say infuriate me, but close, are the
purely clinical descriptions that are based on what the standard's wording is. Like, I don't know,
what have we got, four or five types of module unit? I don't even know how many
we have. I only get to know that when I have to review compiler code. Otherwise,
I don't know. From my perspective, from a practice perspective, you get an interface and you get an
implementation. An interface is distinguished just like we have header files today. If you use the
MSVC compiler, you use .ixx files. That is where you just specify, here is my component. Here are
the things that you can call directly.
And the rest is my business.
So my business, I'm moving it into a .cpp file.
Done.
Like what you're doing today with header files and .cpp files,
you can continue doing that,
except you have a tool that allows you to actually enforce
the boundaries of your component.
And you get to directly express your dependencies, so we know what is going on. And as I was
indicating earlier, we're going to have a bunch of other tools that are going to use
this information through the entire tool chain to help you be productive.
That's all you need to know there. Don't spend too much time on, like,
oh, we have module partition units.
Sorry, I can't responsibly tell people
to go and learn all these four types of module units.
But as a tool builder,
I have to support those scenarios for completeness.
But from a practice perspective,
just make your code very simple.
What else did I want to say?
Oh, yeah, macros.
Please, please.
I know we love macros.
I know
they are very flexible, right?
You can manipulate them,
they are malleable, right?
But that is exactly why it's very hard
to write solid components
when you make too much use of them.
So think about expressing your abstractions
in terms of classes, functions, and concepts,
and not textual replication.
Capture the abstractions and name them with something other than a macro name.
So that's what I would say.
What else?
Yeah, concepts,
concepts for functions and modules.
Yeah.
I wish we had contracts.
That would have helped a lot
in terms of,
we were talking about safety,
increased safety,
that would have helped a lot with code analysis.
There was a lot of controversy at the end
and some misleading descriptions in the public.
I just hope that we get to do it better this time around.
Like, back in 2018, in Rapperswil,
we got an agreed-upon minimal facility that everybody said at the time they could live with.
Like everybody who was involved.
Nobody got exactly what they wanted.
But we got something that was basic enough that we can work with.
Now, Microsoft is very much heading on to code analysis,
static analysis.
And it's not just static analysis.
I say code analysis
because you have analysis
that happens pre-compilation
and analysis that happens post-compilation.
So post-compilation,
Microsoft runs tools
that, after binaries are produced,
it goes and analyzes them for various reasons.
And then you also get runtime analysis things going.
So it is really about code analysis.
And what are contracts?
Contracts are just expressions of the expectations
and guarantees of a facility, an abstraction, a function,
say. It's like, I expect that the arguments
that you're passing in satisfy this predicate.
Again, you see a theme there?
Predicate, predicate, predicate.
And when I complete, I'm making this guarantee back to you.
Like, when I return to you, I guarantee that the state of the program will be such that this predicate holds.
And that is powerful.
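C++20 shipped without contract syntax, so the following is only a sketch of the idea, with plain `assert` standing in for the expectation and the guarantee. The function `isqrt` is a hypothetical example, not something discussed in the episode:

```cpp
#include <cassert>

// Integer square root, rounded down. Hypothetical example function.
constexpr int isqrt(int n) {
    assert(n >= 0);  // expectation: the caller must pass a non-negative value

    long long r = 0;
    while ((r + 1) * (r + 1) <= n) ++r;

    // guarantee: r is the largest integer whose square does not exceed n
    assert(r * r <= n && n < (r + 1) * (r + 1));
    return static_cast<int>(r);
}

static_assert(isqrt(0) == 0);
static_assert(isqrt(17) == 4);
```

Both conditions are ordinary predicates. A runtime checker can evaluate them as written, while an analysis tool could in principle read the same two predicates as facts about every call site.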
And we got that facility.
People who just want to have runtime verification, they can have that. And people who want to be able to express these guarantees through contracts,
you know, use them in code analysis, can also do that.
Right after Rapperswil, some of the people who were party to the agreement
that they all said they could live with just decided to rewrite the entire facility,
which was not acceptable. So,
what happened in
Cologne was that there were
a slew of papers that
were put to vote, and through
you know, it's death
by a hundred, you know, thousand cuts.
Through
a slew of these little
papers, the
facility was gutted. Of course, some of us
were unhappy with that. And I think Nico, Nico Josuttis, said, well, if there's so much controversy, we
should just yank this thing out, because clearly what is there no longer has the support that it
had in Rapperswil. So it was taken out. We all agreed it should be taken out, and that's what happened, unfortunately.
My hope is that the new study group that was formed after Cologne, which is now looking at contracts again,
will come up with a better consensus than we had before. It is a shame that we have to delay this while the problems with
safety are so current, right? But we'll see. It's a lost opportunity, but hopefully
with the learning, with the infrastructure that we have with modules,
we can convince more people.
This is what you get additionally if you make it easier to do code analysis.
Right.
Okay.
Well, it's been great having you on the show again today, Gabby.
I look forward to seeing some of the stuff you are working on with modules come to light.
Thank you very much for having me on. It's really wonderful talking to you. Thanks again. And stay safe.
You too.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast. Please let us know if we're discussing
the stuff you're interested in, or if you have a suggestion for a topic, we'd love to hear about
that too. You can email all your thoughts to feedback@cppcast.com.
We'd also like to thank all our patrons who help support the show through Patreon.
If you'd like to support us on Patreon, you can do so at patreon.com/cppcast.
And of course, you can find all that info
and the show notes on the podcast website
at cppcast.com.
Theme music for
this episode was provided by podcastthemes.com.