C++ Club - Carbon, WG21, ABI, Books, Build, Packages
Episode Date: August 28, 2022
Notes: https://cppclub.uk/meetings/2022/153/
Video: https://youtu.be/2u6fE5v8LBw...
Transcript
Welcome to the C++ Club. This is meeting 153 that took place on the 25th of August 2022.
The August mailing is in. Not too many papers, some of them at version 0.
One of the interesting ones was this one, static_vector.
This is a dynamically resizable vector with fixed capacity and embedded storage
for those situations where you can't allocate memory at runtime, for example,
in embedded situations.
Quote, this paper proposes a modernized boost::static_vector, that is, a dynamically resizable
vector with compile-time fixed capacity and contiguous embedded storage, in which the
elements are stored within the vector object itself.
This is for the situations where memory allocation is not possible, like embedded environments without a free store, where only
a stack and the static memory segment are available, or where you can't accept the performance
penalty of dynamic allocations, due to latency for example. In gaming, if you have a game that runs at 60 fps, then you have about 16 milliseconds to render a frame.
This is also for those situations where std::array is not an option,
because you want to store non-default constructible objects in it, or you need a dynamically resizable
array within constexpr functions, and when you
need your objects to be stored within the actual vector itself. There are
existing implementations in Boost, but this brings this feature to the standard.
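To make the idea concrete, here is a minimal sketch of a fixed-capacity vector with embedded storage. It is not the interface proposed in the paper and not Boost's implementation; the names fixed_capacity_vector, try_emplace_back and Sample are made up for this illustration, and copying is simply disabled to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical element type that is NOT default-constructible,
// which is exactly the case std::array cannot handle.
struct Sample {
    explicit Sample(int v) : value(v) {}
    int value;
};

// Illustrative sketch only: elements live inside the object, no heap allocation.
template <typename T, std::size_t Capacity>
class fixed_capacity_vector {
    static_assert(Capacity > 0, "capacity must be positive");

public:
    fixed_capacity_vector() = default;
    fixed_capacity_vector(const fixed_capacity_vector&) = delete;
    fixed_capacity_vector& operator=(const fixed_capacity_vector&) = delete;
    ~fixed_capacity_vector() { clear(); }

    // Constructs an element in place; returns false when the vector is full.
    template <typename... Args>
    bool try_emplace_back(Args&&... args) {
        if (size_ == Capacity) return false;
        ::new (slot(size_)) T(std::forward<Args>(args)...);
        ++size_;
        return true;
    }

    // Destroys the last element.
    void pop_back() {
        assert(size_ > 0);
        --size_;
        data()[size_].~T();
    }

    void clear() {
        while (size_ > 0) pop_back();
    }

    T& operator[](std::size_t i) {
        assert(i < size_);
        return data()[i];
    }

    std::size_t size() const { return size_; }
    static constexpr std::size_t capacity() { return Capacity; }

private:
    T* data() { return std::launder(reinterpret_cast<T*>(storage_)); }
    void* slot(std::size_t i) { return storage_ + i * sizeof(T); }

    alignas(T) unsigned char storage_[Capacity * sizeof(T)];  // embedded storage
    std::size_t size_ = 0;
};
```

A fixed_capacity_vector<Sample, 2> accepts two elements and reports failure on the third instead of allocating.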
This is revision 5, so I reckon it has good
chances of getting into the next C++ version.
The next one is usability improvements for std::thread. This paper proposes a way to set a thread's stack
size and name before the start of its execution,
both of which are current practice in many domains.
Neither std::thread nor std::jthread in the standard has this feature.
Names for threads are useful for debugging, I guess.
In Windows, for example, many system objects
of this level have names.
So yeah, this would be a useful feature.
But we do have our IDs, right?
Thread IDs, which can also be used.
I mean, what benefit does a thread name add?
The immediate benefit that I can think of is when you have a thread display in the debugger
and want to see what's what. You don't always want to look at thread IDs.
If you have a name, you immediately see it. Maybe there are some other use cases.
So they say most operating systems, including real-time operating systems for embedded
platforms, provide a way to name threads, and names are usually stored in the kernel to manage threads or tasks. The name can be used by
debuggers, by crash dump and trace reporting tools, system task and process
monitors, and other profiling and diagnostic tools. Oh yeah, when you are
profiling, like doing performance profiling, it's useful to see the name of the thread
that is interesting to you.
Right, next one is
Enhancing the break statement. This is a weird one.
This proposes to enhance the break statement with an optional termination
statement that is executed outside the scope of the broken loop. The proposed
syntax is break followed by statement. A statement may be a null statement
as in a classic break statement, or some other statement. Like in this example, if you have two nested for loops,
well, one for loop nested in another,
and within the inner loop you write break continue, for example,
then break applies to the inner loop and continue applies to the outer loop.
Oh.
That's totally not confusing.
We'll see how it fares.
It's at revision zero, so yeah.
And this would have been managed previously as well, right?
A bit simpler way.
I mean, that's okay.
Well, they list two alternatives,
where you sort of have a flag that allows you to decide in the outer loop
whether to continue or not, and you set that flag in the inner loop. Granted, the after version is
shorter. I don't know, maybe you will get used to it if it passes. Yeah.
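The flag-based workaround mentioned here looks roughly like this in today's C++. The function and its data are made up for this illustration and are not taken from the paper; the proposed break continue spelling appears only in a comment because no compiler accepts it.

```cpp
#include <cstddef>
#include <vector>

// Counts the rows that contain no negative value (hypothetical example).
inline std::size_t count_clean_rows(const std::vector<std::vector<int>>& rows) {
    std::size_t clean = 0;
    for (const auto& row : rows) {
        bool skip_row = false;  // flag set by the inner loop
        for (int value : row) {
            if (value < 0) {
                skip_row = true;
                break;  // leaves the inner loop only
                // With the proposal this would be written `break continue;`,
                // and the flag plus the check below would disappear.
            }
        }
        if (skip_row) continue;  // acts on the outer loop
        ++clean;
    }
    return clean;
}
```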
It's very contradictory words beside each other.
Break and then continue.
I expect, if this passes,
people will want to add additional keywords to existing keywords.
Or maybe even more. If you have a doubly nested loop,
you could add break continue break or something.
That'll be great.
The next paper I wanted to look at, which I think was from an earlier mailing but I kind of missed
it, is language support for customizable functions. This is at revision 1,
and it's proposed by Lewis Baker,
Corentin Jabot, and Gašper Ažman. I think we looked at this before,
and this is about being able to declare
customizable functions at namespace level.
So imagine a virtual function without a class.
A virtual function without a class?
Yeah. In this version they call them customizable functions, with the appropriate modifier
after the function declaration.
And you would have a default implementation if you wanted, and you would also be able to override
with the same override keyword that's currently used for virtual functions, only this
would be for customizable functions. I think they wanted to name them virtual previously, but decided against it to avoid too much confusion.
Oh yeah, and this is revision 1, quote,
New syntax to introduce customizable functions using a contextual keyword, customisable,
rather than virtual and = 0.
This is to address initial feedback that reusing virtual for a feature unrelated to dynamic polymorphism was at best confusing.
Yeah, that's probably for the best.
Have we seen any use case? Previously we used to specialize or overload the functions.
This is to define customization at namespace scope, and
previously people used ADL-based or tag_invoke-based customization
solutions for that, but they are cumbersome, and ADL... anything ADL is even more confusing.
With this you would be able to customize function templates,
have generic customizations with auto in the template parameter list.
Yeah, well, func is just a name here. Yeah.
Not anything.
So this is an example of the previous tag_invoke-based customizations, where you wanted to customize
something at the namespace level instead of the class level.
And this will be much simpler with the customizable functions. Shorter at least, yeah.
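For comparison, this is roughly what the older ADL-based style of namespace-level customization looks like. It is a simplified sketch: it is neither tag_invoke nor the syntax proposed in the paper, and the namespaces lib and app and the function names are invented for the example.

```cpp
#include <string>

namespace lib {

// Default implementation, used when a type provides no customization.
template <typename T>
std::string describe(const T&) {
    return "unknown";
}

// Generic code goes through the customization point with an unqualified call,
// so argument-dependent lookup can find overloads in the argument's namespace.
template <typename T>
std::string report(const T& value) {
    return describe(value);
}

}  // namespace lib

namespace app {

struct Widget {};

// Customization for Widget, found by ADL when lib::report is instantiated.
inline std::string describe(const Widget&) {
    return "widget";
}

}  // namespace app
```

Here lib::report(app::Widget{}) picks the app overload, while lib::report(42) falls back to the default. The fragile part is that everything depends on name lookup rules, which is the confusion the speakers are referring to.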
I noticed that the keyword customisable uses British spelling, with an s. I don't know if that's common in American English. But I'm sure it'll cause some bikeshedding in a
meeting or two. I remember when I was back at Symbian, we had a rule. Most of our developers were Brits, or UK-based,
but we had a rule that
anything in source code
uses American spelling,
just to be consistent.
And I've seen that
elsewhere also.
So we
don't tend to use
British spelling in the source code.
Have you participated in these
discussions? I mean, where the papers are finalized to be published in the next release.
You mean in the committee? I'm not a member.
Oh, you are not? Okay.
No, I just like to comment from the sidelines.
Okay, okay.
Right, so the next item is: Jason Turner
posted a tweet
with some positive
C++ information.
Quote. Theory:
every aspect of modern technology hinges
on C++.
Evidence: JavaScript engines, browsers, Windows, KDE, JVM, all written in C++. Python is C,
but all major libraries are written in C++. GTK on Linux? Your C compiler is written in C++.
Also, Adobe Tool Suite, OpenOffice, and Microsoft Office are written in C++.
Rust programmer?
You rely on LLVM, which is written in C++.
If we want to go even deeper, it's no secret that the main lithography machines used to actually create our CPUs run software written in, you guessed it, C++.
Many embedded developers are moving to C++.
If you use an IDE, it's either written in C++
or relies on software written in C++.
Once GCC moved to C++,
I think C++ became unavoidable in your ecosystem.
Long future for us.
Yeah, so that's a little dose of C++ optimism for you
in case you need it after the previous meeting.
Yeah, after Carbon, yeah, I remember.
I can see how some people became depressed.
Like if you just started learning C++
and you see something like that,
and all the hot takes that
C++ is dead and
everything should use Carbon
now, and we need
10 Carbon
developers with 10 years of experience.
It's no surprise.
I think you showed a job ad
where they were looking for a Carbon
developer with 10 or so years of experience.
Yep.
So, on to Carbon reactions now.
On the podcast ADSP, Algorithms plus Data Structures equals Programs, Bryce Lelbach and Conor Hoekstra talked about Carbon,
but not much, actually.
The main snippet of useful information from this episode
was that the current Carbon core team,
Chandler Carruth, Kate Gregory and Richard Smith,
are not benevolent dictators for life, as some people feared or hoped.
There will be rotation.
They won't stay at this post forever.
And that was pretty much it for the useful information in this episode.
Oh, that's it? Okay.
There was an article that I read,
which is called
C++ Syntax Sucks and Carbon Fixes It.
This is by Erik Engheim,
a C++ developer from Norway.
He posted this article on his Medium blog.
And as you can see here, Medium introduces a paywall for your articles,
even if you intend them to be free.
Some people without paid accounts won't be able to read them, so don't use Medium.
A better title for this article I thought would be
C++ syntax is hard to parse and Carbon syntax is a bit simpler.
But of course it's not as click-baity as this one.
The author quotes Jeff Hykin,
the author of the VS Code C++ highlighter plugin,
who says,
the C++ syntax highlighter at 19,000 lines is not only the largest of any language,
but nearly four times larger than the
second largest syntax, which is TypeScript at 5,000 lines. The author writes, quote, the problem
with C/C++ syntax is that you cannot determine what a statement is until you have parsed several tokens. End quote. Ah, the famous programming language C/C++.
He mentions the most vexing parse, where you can't easily distinguish between a function
and a variable declaration in C++. For example, if you have int bar(int(x));
it could be either a variable declaration or a function declaration.
True. Which depends on context.
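As a concrete illustration of that example: C++ resolves the ambiguity by treating anything that can be read as a declaration as a declaration, so the line below declares a function. The helper make_baz is made up for this sketch; it only shows a spelling that cannot be mistaken for a parameter list.

```cpp
#include <type_traits>

// Looks like "an int named bar initialized from x", but the compiler reads it
// as the declaration of a function: int bar(int x);
int bar(int(x));

// The compiler agrees that bar is a function, not a variable.
static_assert(std::is_function_v<decltype(bar)>, "bar is a function declaration");

// Braces cannot be parsed as a parameter list, so this really is a variable.
inline int make_baz(int x) {
    int baz{int(x)};
    return baz;
}
```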
And apparently it's not an issue in Carbon because Carbon classes
don't have constructors and objects are initialized using assignment.
Erik writes, one of the most annoying things when writing C++ code is dealing with const
correctness and references. He likes the fact that in
Carbon the compiler takes care of choosing the best way to pass parameters
to a function, so the programmer doesn't have to think about it. I have a
suspicion that we'll end up having to know what the compiler chooses to
represent parameters and thus go back to the C++ situation,
only even more confusing. If you remember the discussion around C++ concepts before they were
accepted in C++20, where we didn't get the short concept declaration syntax
just because people wanted to know what's behind the name.
And suddenly in Carbon, everyone is fine with it.
Cool.
So Erik really likes introducer keywords in Carbon
that introduce variables or functions.
He says, quote,
This makes scanning a list of functions or methods quicker in Carbon.
Especially languages such as Java are terrible in this regard.
You read public, static, void,
and then you finally get to the important part, which is the method name.
C++ is better, but still takes focus away from method names,
by having info about the return values come before the method name.
He says nothing about the modern function declaration syntax in C++ using auto, which I'm sure he knows about,
but chooses to omit because it's inconvenient for the point he's making.
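The syntax being referred to is the trailing return type, available since C++11. A small side-by-side sketch, with function names invented for the example:

```cpp
#include <string>

// Classic form: the return type comes before the function name.
inline std::string greet_classic(const std::string& name) {
    return "Hello, " + name;
}

// Trailing return type: the name follows the `auto` introducer directly,
// which reads much like the introducer keywords praised in the article.
inline auto greet(const std::string& name) -> std::string {
    return "Hello, " + name;
}
```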
To summarize the article, Carbon doesn't improve C++ syntax but chooses to be different,
and that's okay. Yes, syntax of C++ is hard to parse. But on the other hand,
Carbon barely exists at the moment, to the point that Erik had to make assumptions on how it may
look based on Rust and Go. And Carbon doesn't even have a proper compiler yet. So we'll see
how Carbon syntax develops. Is anyone using Carbon outside of Google right now?
I hope not.
Maybe some experiments, but not more than that.
It's too risky for anyone to do anything.
And we don't have any IDE or any compiler,
as you mentioned, in the market right now.
I'm not sure about the compiler.
They might have something internally.
I think there is support for a language server or something,
and there is a plugin for VS Code that supports Carbon.
But I'm not sure how this is compiled or how this works.
So the Reddit thread is here, and here are some quotes.
They just wanted the parser to be easier. Most of the syntax is worse for users.
Agreed. Maybe it solved some fringe cases but for 90% of the usage, it looks like the syntax has more noise and is harder to read.
Someone repeated my previous point.
As of C++11, there is the auto keyword, alternative function syntax, and aggregate initializers.
When you use those, it makes the syntax look a lot like Carbon.
Someone says,
I've seen people actually ask, should I learn C++ still?
Or is it inevitably dying because of Carbon?
And it's just the silliest thing to ever suggest.
And another quote.
Given that Carbon was started because by all appearances
Google lost a single vote in the C++ community
and decided to not try again or convince people and reach consensus,
but rather hard fork,
what will they do in the future
if the Carbon community votes against them on some issue?
Or is the intention
that this is a forever Google-dominated project?
Questions.
A related Reddit post is about ABI, our favorite topic as of late. The post asks, is it too late to break ABI? Quote, has the ship sailed? Is C++ doomed?
Or do we have a magic solution in the horizon? A few years ago, C++ voted to put performance and
ease of use as second priority over breaking ABI. Google stopped its contributions to C++ and Clang. Today they announced Carbon.
Do we even intend C++ to keep evolving? Or we intend for C++ to follow the C path and prevent
evolution in favor of not breaking anything forever? Will an ABI break ever happen? End quote.
And a question like this on Reddit usually results in a thread full of rants and venting.
Here is my summary of the points raised in the thread.
The committee is useless.
Simple answer.
There's usually one of that kind in every thread like this.
Caring so much about ABI costs the committee contributions of Google employees.
Those still using C++98 and old compilers should be disregarded.
Old code doesn't deserve new compilers.
Everybody should compile everything every time from source, just like Google.
The embedded domain also exists, and compiler upgrades story there is painful.
People are still upset at the C++11 string copy-on-write ABI break.
People are too obsessed about breaking ABI. It won't solve C++'s problems.
Backwards compatibility is C++'s strength, not weakness.
You can just implement your own types for performance.
Epochs would have been great, but are unrelated to ABI.
As we discussed before, epochs are pretty much unimplementable and would not fix things, but instead are likely to
fragment C++. Another point from this thread: the standard committee hasn't
banned ABI breaks. What they did, however, was rule out introducing changes that mandate ABI breaks for now.
So there won't be a forced ABI break.
But ABI breaks may happen in the future.
And the usual points from a thread like this.
C++ sucks. And on the contrary, C++ is great and is not going away.
So that's the summary of the thread for you.
There is a book that was released by Googlers Titus Winters, Tom Manshreck and Hyrum Wright.
The book is called Software Engineering at Google.
It was released by O'Reilly in 2020 and is now available online,
either if you have an account with them, or you can read it for free on the web,
as you see here on the screen.
Quote, it's about the engineering practices utilized at Google to make
their codebase sustainable and healthy. End quote. So this book should be useful
even if you aren't trying to get a job at Google. Reading it online could be
tricky though due to Google's choice of an extremely thin font.
They probably thought that software engineering at Google is so difficult that even reading about it should be as hard as possible.
You can always use your browser's reader view though.
Another good book, it's a classic actually, Elements of Programming by Alexander Stepanov
and Paul McJones.
They made their book freely downloadable as a PDF because it's now out of print and the
publisher reverted the rights to the authors. So that's what they did.
Published a free PDF.
Right, this is a tech note
by Konstantin Burlachenko
from C++98 to C++2020.
It's pretty long.
There's also a PDF available in the same repository,
and it outlines changes to various aspects of C++ from C++98 to C++20,
which can serve as a refresher for C++ developers especially not familiar with modern C++. Curiously, he included
the following explanation. C/C++. By C/C++ we mean C or C++
programming language. Still, seeing C/C++ throughout the document is a
bit... ehh.
Hacking C++ posted a tweet about new string functions in C++20 and C++23. It's a small cheat sheet with a nice shout out to the Expanse book series and Amazon TV show.
Very good. If you like hard sci-fi, I advise you to read the books and watch the show.
Tachi was a spaceship name from that show. So the new functions are starts_with and ends_with
for C++20 and contains for C++23. There is also resize_and_overwrite in C++23, which resizes the string's buffer
and lets a callback overwrite the contents in place.
All useful functions, but other languages are giggling uncontrollably.
They have had these for like forever.
An interesting thing about sanitizers that I didn't know about:
Daniel Lemire wrote on his blog about catching sanitizer errors programmatically. To be
fair, this is for AddressSanitizer only. Apparently both GCC and LLVM sanitizers call a function
__asan_on_error when an error is
encountered during an instrumented run. So if you provide an implementation of that function in your program that is
instrumented with ASan, and an ASan error is detected at runtime, your function
gets called and you can write that error to a log, which is really useful because otherwise you would have to redirect the console or just
watch the console for any errors. But this way you can make it persistent.
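A minimal sketch of such a hook follows. The function name __asan_on_error is the one AddressSanitizer looks for; the log file name and the counter are assumptions made for this illustration, and the hook only fires in a build instrumented with -fsanitize=address.

```cpp
#include <cstdio>

// Counter kept only so the effect of the hook is observable in this sketch.
static int g_asan_reports = 0;

// AddressSanitizer calls this function, if the program defines it,
// when it is about to report an error.
extern "C" void __asan_on_error() {
    ++g_asan_reports;
    // Hypothetical log file name; append so earlier reports are preserved.
    std::FILE* log = std::fopen("asan_errors.log", "a");
    if (log != nullptr) {
        std::fputs("AddressSanitizer reported an error\n", log);
        std::fclose(log);
    }
}
```

In a build without the sanitizer the function is never called, so it can stay in the source unconditionally.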
I don't know if any other sanitizers have this.
Right, next item is Raymond Chen, who wrote on his blog an article called
How can I synthesize a C++20 three-way comparison from two-way comparisons?
In this article he discusses the new C++20 spaceship operator.
He presents a useful table that lists comparison
outcomes of a spaceship operator, of which there are five, which is a bit
weird because the operator is called the three-way comparison operator, but there
are five outcomes. But if you're a C++ programmer, you shouldn't be surprised.
And this goes like this.
For strong ordering, the result can be less, equal, equivalent, or greater.
For weak ordering, the result can be less, equivalent, or greater.
And for partial ordering, the result can be less, equivalent, greater, or unordered.
What's unordered? Presumably when the order is unspecified. It's just a potential
outcome. I'm not sure what partial ordering is. Well, that's part of the algebra, where not all elements have a relationship, so like
a coin from a coin, so let's integrate them. Algebras can be quite weird.
Right. Thank you.
Regarding the difference between equal and equivalent,
Raymond says,
For example, two instances of the same string hello are equal
in that they represent the same string and are fully interchangeable.
On the other hand, two people with the same security
clearance are equivalent from a security perspective. They have access to the same things,
but they are not equal. They are nevertheless different people. End quote. Raymond states,
quote, the strong ordering distinguishes between items being equal, identical and interchangeable,
and equivalent, not interchangeable, but close enough for some purpose. End quote.
Interestingly, in the Reddit thread, Stephan T. Lavavej of Microsoft says this is incorrect.
The standard mandates that equal ordering is strictly the same as
equivalent ordering.
Raymond continues, quote, Suppose you have an object from a class library that predates
C++20 and doesn't support three-way comparison.
You want your code to be able to take advantage of the three-way
comparison should the library be updated, but fall back to two-way comparison in the meantime.
In other words, you want to take advantage of three-way comparison if available. His advice
is to use std::tuple for comparing objects that don't support an equality check and only support
operator less than. Quote, tuples have the bonus property of supporting the three-way comparison
operator even if the underlying types do not. In the case where they do not, they will synthesize
a three-way comparison from the two-way comparisons. And he presents an example where a comparison function
for a custom type uses tuples and the spaceship operator.
Note that if the type being compared supports operator ==, it will be more efficient
to use std::compare_weak_order_fallback, as it checks for equality first, which is more
efficient, and only then falls back to operator less than.
Interestingly, in the thread Stephan T. Lavavej also says that
in order to use the spaceship operator, it is necessary to include the <compare> header,
which is a bit weird, as now we have an operator, a language feature,
that depends on a standard library header.
It's the same as with typeid: if you want to use it, you need the <typeinfo> header.
Next item is a Visual Studio plugin.
It's a Visual Studio extension and standalone app for build times and compilation data visualization.
It's called Compile Score. When you have it enabled, it shows
you the compile score, or how expensive a particular line is for the compiler,
inline in your source code, and it looks very useful to me. I wonder if it's just
for includes or for everything, like structures and templates.
I might have to try it, but it does look useful.
It uses either MSVC, where the modern version has a facility to provide statistics about the build times,
or, if you use Clang, it can use Clang's -ftime-trace switch for the same purpose.
Right, next item is a short piece of news about the new Clang 16.
It's about debugging coroutine facilities.
If you're lucky enough to be able to use a modern version of Clang and are working with
coroutines, these features could be very helpful.
Quote from the introduction.
For performance and other architectural reasons, the C++ coroutines feature in the Clang compiler is implemented
in two parts of the compiler.
Semantic analysis is performed in Clang, and coroutine construction and optimization takes
place in the LLVM middle end.
However, this design forces us to generate insufficient debugging information.
Typically, the compiler generates debug information
in the Clang frontend, as debug information is highly language-specific. However, this is not
possible for coroutine frames, because the frames are constructed in the LLVM middle end.
To mitigate this problem, the LLVM middle end attempts to generate some debug information,
which is unfortunately incomplete,
since much of the language-specific information is missing in the middle end.
And this document describes how to use this debug information to better debug coroutines. So
the weird nature of C++20 coroutines strikes again. I wasn't even aware that this separation of
processing exists for coroutines. In the debugger apparently you can do things like print promise
types, print coroutine frames, which should be helpful when debugging. Speaking of coroutines, Lewis Baker tweeted recently about
implementing coroutine-based code. As far as I know, he wrote a bulk of coroutine
code and the helper library CppCoro, which provides a bunch of useful
coroutine-based abstractions that are easier to use than bare coroutines.
He tweeted a link to Godbolt containing a heavily commented skeleton code for writing
coroutine-based code. Steve Downey took the code and put it in a buildable CMake project for convenience. The code is pretty long, so if you only want to
use pre-built coroutine wrapper classes, you can use Lewis's CppCoro library, but
if you need to implement coroutine support from scratch for your own types,
then this boilerplate code may come in very handy.
And speaking of skeleton projects,
John McFarlane created a skeleton project for CMake.
It's a Wordle puzzle solver,
but that's not its main purpose. Its main purpose is to introduce
a template for your own projects that use CMake and C++20. Other useful things
that are set up for you by this template are: toolchains, Clang, GCC and MSVC; package managers, Conan and vcpkg;
collaboration and CI, GitHub Actions and GitLab;
analyzers, AddressSanitizer, Cppcheck, Clang Static Analyzer, Clang-Tidy, Google Coverage,
Graphviz. This one I didn't realize would be useful. Graphviz is used here
to represent package and target dependencies as a graph. It's a very nice idea. Then there is the Include What You
Use tool to analyze includes; libFuzzer, coverage-guided fuzz testing; the pre-commit linting framework with formatting and correctness checks for
Bash, C++, CMake, JSON, Markdown, Python and YAML;
UndefinedBehaviorSanitizer; and Valgrind.
So seeing how notoriously difficult CMake can be
when you are setting up a project from scratch,
this seems like a very useful template.
Speaking of large project management,
someone asked for suggestions on Reddit
about a roadmap for advancing C++ large project package
management in the finance industry. So the main pain point is package management and the build system.
People suggested CMake, obviously, although it's not really a package manager, but
modern CMake does have some facilities to point to a particular package on GitHub, for example,
and download it, or to use other package managers like Conan or vcpkg. Another package manager and build system suggested in the thread was xmake, which is perfect
for personal projects, still being actively developed, but as it stands now, it's pretty
problem-free and I'm using it personally for my toy projects. Another build system suggested was build2,
which seems like an up-and-coming build system
that supports things like modules, etc.
And the usual package managers, Conan and vcpkg.
Another similar discussion from Reddit. Someone asked, Bazel or CMake? Which one should I choose for a new project with lots of
dependencies? Very long thread. Someone said, paraphrasing Churchill, CMake is the worst software development
build platform except when compared to all the others. Nice, but no. There are nicer
systems. And apparently, according to many people in the thread, Bazel is one such system. Someone doesn't like Bazel, quote,
our team have a medium-sized C++ project that uses Bazel. Managing dependencies is painful,
as nearly all third-party C++ libraries use CMake. Actually, our codebase were originally using CMake,
but as a high-rank manager who came from Google that
kept saying how great Bazel is, and finally persuaded our tech lead to switch to Bazel.
By the way, the ex-Googler left our company a year ago, and Bazel sticks with us.
End quote.
On the other hand, people say in the thread, managing dependencies in Bazel has been a
lot easier than CMake.
And also, Bazel is a better and more powerful build system overall,
is more hermetic, has stronger support for accurate caching and parallel execution,
including remote caching and remote execution,
and caching and parallel remote execution of tests.
There are other quotes in favor of Bazel but people also suggested other build
systems like xmake, Meson and even Premake, which apparently is out of alpha
or beta now and can be used in production. Who knew?
Okay, Boost version 1.80
was released recently.
And
there were no new libraries,
but there were
many updated libraries, and
someone in the reddit thread was pleasantly surprised by
the library called GIL or Jill. Gif is pronounced as Gif, I'm sure. So this must be Gil. This is a library for graphic processing and I didn't even realize Boost had one.
It seems pretty capable. Lots of changes in Asio, Filesystem and other libraries like JSON, LEAF, Log and so on.
The Multiprecision library marks C++11 as deprecated, and C++14 will be the default.
Lots of nice things in Boost. Next is a nice and interesting project on GitHub, which is an engine simulator. It's a nice
showcase project for C++. It's a combustion engine simulator that generates realistic audio.
You can add various parameters like how many cylinders there
are and various other things and it will model that engine for you and generate a
realistic sound depending on RPMs and car speed and whatnot. It's really very interesting. It has various key controls, like toggle ignition,
hold for starter, change speed, up gear, down gear, and yeah,
simulation time warp.
Does it run the CPU on all cylinders?
Someone said in the Reddit thread that, oh, I now have like Ferrari V8 engine.
Sadly, only virtual. I think that's it for serious topics and now for
lighter ones. Someone asked on Reddit how to gauge a programmer's C++ competency.
And their first reply was, ask them their opinion on a language.
If the first thing they do is sigh, they're probably okay.
And I also wanted to show you this tweet from Ben. If you ever code something that
feels like a hack but it works, just remember that a CPU is literally a rock
that we tricked into thinking. And a small follow-up from the same person.
They say not to oversimplify. First you have to flatten the rock and put lightning inside it.
So yeah, we work with rocks.
Indeed, it's all magic smoke. If it escapes, then you die.
Right, that's it. Thanks for coming and I'll talk to you next time. Bye-bye.
Thank you, bye-bye.