Transcript
Hello and welcome to the C++ Club. This is episode 17 for the meeting number 142
that took place on the 20th of January 2022.
Bjarne Stroustrup gave a new interview to the hosts of the YouTube channel called
Context Free and the links are in the show notes.
The first part is more general, the second part
is more technical.
James Webb Space Telescope uses C++.
During a YouTube Q&A session, the JWST team stated that the telescope's onboard software is written in C++. There are two Reddit threads on that, links in the show notes.
Looks like the OS is VxWorks, and there is a JavaScript interpreter there,
which is used to read text files with parameters and issue commands to instruments.
The company behind this ScriptEase JavaScript was called Nombas.
No MBAs?
And it got acquired by Openwave,
so it doesn't exist anymore.
There is a history of Nombas
and a technology evaluation document
for this telescope.
There is also a document called
Event-Driven James Webb Space Telescope Operations
Using Onboard JavaScript.
One of the Redditors in the thread says that the CPU that powers JWST is RAD750,
which is a modified PowerPC 750 CPU, similar to the CPU in the Nintendo GameCube.
GCC 12 supports the mold linker.
GCC 12 just added support for the amazingly fast mold linker,
which leaves other linkers in the dust, including LLD.
Just look at those sweet benchmarks.
2021 C++ Standardization Highlights
Botond Ballo writes,
"The ISO C++ Standards Committee has not met in person since its February 2020 meeting
in Prague.
However, the committee and its subgroups have continued their work through remote collaboration, and a number
of notable proposals have been adopted into C++23, with many others in the pipeline.
I've been less involved in the committee than before, so this post will not be as comprehensive
as my previous trip reports."
Botond describes remote collaboration in the committee, where subgroups would meet
on Zoom throughout the year, with virtual plenary meetings and votes three times per
year.
He lists some key procedure documents – the C++ International Standards Schedule, an outline
of C++23 priorities, an outline of the remote collaboration model, papers and mailings, and the GitHub
issue tracker for papers.
New to me was the GitHub forwarding link for papers.
That's https://wg21.link/p followed by the number of the proposal, which I knew before, but apparently if you append another slash and
write github, it will take you to the GitHub progress status page for that particular paper.
Botond lists the proposals selected for C++23 and links to the appropriate papers,
as well as technical specifications and papers in progress.
He pays special attention to the following topics. Fixing the range for loop, the rejected proposal
to address the issue of lifetime of temporaries. Quote, unfortunately, the proposal narrowly failed
to garner consensus in EWG, with some participants feeling that this was too specialized a solution that would not address similar cases of dangling references in contexts
other than the range-based for loop, such as auto&& range = foo(temp()),
where temp() is a function that produces a
temporary. I think the counter-argument here is that in these
cases the dangling reference is explicitly visible,
whereas in the range-based for loop it's hidden inside the implicit rewrite of
the loop." Perfect is the enemy of good, am I right?
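The lifetime problem in the quoted auto&& case can be made visible with a small sketch. The names Temp, temp, foo, event_log and sum_via_forwarding_reference are hypothetical stand-ins, not code from Botond's post. To stay free of undefined behavior, foo here returns a reference to a static vector instead of data owned by the temporary, so the sketch only shows when the temporary dies, not an actual dangling access.

```cpp
#include <string>
#include <vector>

// Records the order of events so the lifetime of the temporary is observable.
std::vector<std::string> event_log;

// Hypothetical type: its destructor records when the temporary goes away.
struct Temp {
    ~Temp() { event_log.push_back("temporary destroyed"); }
};

// Stand-in for temp() from the quote: produces a temporary.
Temp temp() { return Temp{}; }

// Stand-in for foo(): it returns a reference, so binding the result to
// auto&& does not extend the lifetime of the Temp argument.
// To keep this sketch free of undefined behavior, the reference refers to a
// static vector rather than to data owned by the temporary.
const std::vector<int>& foo(const Temp&) {
    static const std::vector<int> values{1, 2, 3};
    return values;
}

int sum_via_forwarding_reference() {
    auto&& range = foo(temp());  // the Temp dies at the end of this statement
    event_log.push_back("range used");
    int sum = 0;
    for (int v : range) sum += v;
    return sum;
}
```

After calling sum_via_forwarding_reference(), the log shows the temporary being destroyed before the range is used. If foo had returned a reference into the temporary, every later use of range would dangle.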
Another issue Botond talks about is named arguments making a comeback.
And Botond finishes with this, quote,
Collaboration for the time being continues to be remote.
As of this writing, the earliest in-person meeting not to be definitely cancelled is the one in July 2022. It remains to be seen whether we will in fact be able to
hold this meeting in person. Whenever the committee does resume in-person meetings,
they're likely to at least initially be hybrid, meaning there will be audio-visual equipment to
allow continued remote participation for those who prefer it.
A little update.
The face-to-face meeting in July 2022 has been cancelled. And also I've received credible feedback that mixed meetings don't really work.
Almost always unsigned? Dale Weiler wrote an article called Almost Always Unsigned,
and I'm not sure what to think. I thought this question had been put to rest and everyone agreed
that unsigned should only be used for bit manipulation, security, and serialization,
but not for arithmetic. Seems not everyone thinks that.
Quote, the need for signed integer arithmetic is often misplaced, as most integers never represent
negative values within a program. The indexing of arrays and iteration count of a loop reflects
this concept as well. There should be a propensity to use unsigned integers more
often than signed, yet despite this, most code incorrectly chooses to use signed integers almost
exclusively." Dale examines all the usual arguments in favor of signed integers and
tries to dismantle them with varying success, sometimes using questionable C++ code.
He addresses the following points in favor of signed. The safety argument, dealing with overflow and underflow. Loop variables, for which his solution is just to use reverse iteration,
which is completely natural looking. The difference of two numbers can become negative.
Computing indexes using signed integers is safer.
Yes, it is.
Unsigned multiplication can overflow.
Sentinel values.
Come on, nobody uses this to argue in favor of signed integers.
And it's the default.
Yeah, STL sizes are unsigned.
There is a section called
Your counter-arguments are about pathological inputs
in which the author dismisses the argument out of hand.
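Two of the points above, loop variables and the difference of two numbers, are easy to try out. The following is a minimal sketch with hypothetical function names. It is not code from Dale's article, and the unsigned loop uses one common idiom that may differ from the one he shows.

```cpp
#include <cstddef>
#include <vector>

// Reverse iteration with an unsigned index. The naive "i >= 0" condition is
// always true for unsigned types, so this sketch tests before decrementing.
std::vector<int> reversed_with_unsigned_index(const std::vector<int>& v) {
    std::vector<int> out;
    for (std::size_t i = v.size(); i-- > 0;) {
        out.push_back(v[i]);
    }
    return out;
}

// The same loop with a signed index reads like the forward loop mirrored.
std::vector<int> reversed_with_signed_index(const std::vector<int>& v) {
    std::vector<int> out;
    for (std::ptrdiff_t i = static_cast<std::ptrdiff_t>(v.size()) - 1; i >= 0; --i) {
        out.push_back(v[static_cast<std::size_t>(i)]);
    }
    return out;
}

// Unsigned subtraction wraps around modulo 2^N instead of going negative.
unsigned difference_unsigned(unsigned a, unsigned b) { return a - b; }

// Signed subtraction yields the mathematically expected result here.
int difference_signed(int a, int b) { return a - b; }
```

Both loops produce the same result. The subtraction differs: difference_signed(2, 3) is -1, while difference_unsigned(2u, 3u) is the largest unsigned value, so a - b ends up greater than a even though b is positive.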
His arguments for unsigned are
Most integers in a program never represent negative values.
Wow, talk about generalization.
Compiler diagnostics are better for unsigned, but that's worse overall.
And checking for overflow and underflow is easier and safer.
Since C and C++ make signed integer overflow and underflow undefined, it's almost impossible
to write safe, correct, and obvious code to check for it.
He presents a code snippet to check for signed overflow-underflow,
which is quite convoluted, I must admit.
He says,
Your code will be simpler and faster,
and it actually works.
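To make the overflow-checking argument concrete, here is a minimal sketch of both styles of check. The function names are hypothetical and this is not the snippet from the article. It only illustrates why the unsigned check can run after the addition while the signed one has to run before it.

```cpp
#include <limits>

// Unsigned addition wraps around, which is well defined, so overflow can be
// detected after the fact: the wrapped sum is smaller than either operand.
bool add_overflows_unsigned(unsigned a, unsigned b, unsigned& sum) {
    sum = a + b;
    return sum < a;
}

// Signed overflow is undefined behavior, so the check has to happen before
// the addition and must itself avoid overflowing.
bool add_overflows_signed(int a, int b, int& sum) {
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    if ((b > 0 && a > max - b) || (b < 0 && a < min - b)) {
        return true;  // sum is left untouched
    }
    sum = a + b;
    return false;
}
```

The signed version is longer, but it reports the error. The unsigned version also hands back a wrapped sum that looks like a valid value if the caller ignores the return value.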
Okay, sure, I'm convinced now. Dale concludes, quote,
you might be wondering how possible it is to actually use unsigned almost always, as this
title suggests. It's been my personal preference for half a decade now, and I haven't actually
missed signed integers since switching to it. It's made my code much easier, cleaner, and more robust,
contrary to popular wisdom. I would not suggest trying to use it in an existing codebase that is
mostly signed integers. In such contexts, you're more likely to introduce silent bugs and issues
as a result of unsafe typecasts. But consider trying it next time you start a new project.
You might be pleasantly surprised.
End quote.
This article unsurprisingly generated a lot of noise on Twitter.
What was surprising to me were some very well-known C++ people
expressing their opinions on this topic.
And being wrong.
The conversation started with Arvid Gerstmann tweeting a link to the above article and commenting
almost always signed.
This prompted Peter Sommerlad to quote his tweet with the comment
mostly correct with some bad C++ examples
i.e. using new to allocate an array instead of make_unique
could also benefit from being less polemic
and I cannot understand the obsession
with using signed integers by some.
That was really disappointing to read.
Joel Falcou replied,
unsigned are not made for arithmetic.
Daniela Engert joined the discussion
on the unsigned side.
I don't think this is true, she says.
Joel, they're rather dangerous. They're
not modeling the proper subclass of integers. Daniela, what is the proper subclass of integers
in the first place? I rarely, if ever, need and use signed ints in my code. So do my colleagues.
Joel, ones where a minus b is less than a if b is positive.
Daniela, so you have never been working in application domains where this is a required
feature? It all depends on the intrinsic nature of the entities you want to model.
If you want to model a subset of Z, then go with ints. If you want to model a subset of N0, then go with unsigneds.
Tobias joins in on the signed side. But N0 does not have well-defined subtraction.
Like if you do subtraction, you should not use unsigned.
The issue I have with unsigned is that 0 is just too close to the edge to be comfortable.
With signed, at least you're INT_MAX steps away from overflows in both
directions. Daniela, it certainly has a well-defined subtraction if you meet the preconditions.
If you can't satisfy the preconditions, then you are in a situation where subtraction doesn't make
sense in the first place, i.e. where you have to think about the meaning of your model.
Joel, yes, therefore using unsigned for stuff like size of containers is a bad idea.
Size is not N0, it's Z.
Tony Van Eerd.
Programmers expect numbers to work like numbers.
Signed works more like numbers than unsigned.
Peter insists.
How can size be negative?
Nobody programming should expect C's built-in types to work like numbers. They aren't.
Tobias, all operations that are well defined on numbers
work exactly the same way in C, addition, subtraction, multiplication. Division can
be sensibly interpreted as returning a pair, a divided by b and a modulo b, which fits as well.
Except for really large values, this is true for signed.
For unsigned, you only need really small values to run into an overflow.
That is much more likely than a signed integer overflow.
Daniela, I strictly oppose this notion at a fundamental level. Your modeling of container
sizes in Z is equivalent to add two more elements to the existing container to make it empty.
A valid operation. UB. She continues. And therein lies the problem. Expectations don't always hold.
To me, selecting or creating the most appropriate type to model an entity is the
most noble art in software design. If you fail at that in fundamental aspects, you're lost.
Joel. The container invariant is size larger or equal to zero, so it can't happen. It should not
be the invariant of the type containing the size.
Tony. Agreed, but it is rare to see anyone make a type that accurately models
numbers. Instead 99% of the time everyone just picks a readily available type that
is close-ish. Signed is closer to expectations than unsigned. They both
suck, but pragmatically signed sucks less.
Peter. Unfortunately, signed integer arithmetic is very prone to surprising and undefined behavior
in C++, especially in cases where operands are of an unsigned flavor. I think we can agree that
none of the built-in arithmetic types is without problems, and combining them adds more.
Daniela raises the stakes. Home assignment. Implement cryptography with an algebra where this property holds true. Joel counters. Home assignment. Don't move goalposts between each
tweet. If you want to assume something is always positive, it expresses your intention much more clearly.
Peter.
I'm sorry, I don't get it.
If I have the invariant that values must be 0 or positive,
why am I supposed to not use a type
that only has 0 or positive integers to represent it?
Joel.
It's an invariant of the container,
therefore the container expresses it at its level.
Invariants of unsigned types are implicit and lead to a false sense of security
as soon as people try to use unsigned types as arithmetic types.
So there you have it.
This thread shows that the signed versus unsigned argument is far from over.
Daniela Engert and Peter Sommerlad are firmly in the unsigned camp,
whereas Arvid Gerstmann, Joel Falcou, Tony Van Eerd and others,
including myself, think signed integers are better at modeling numbers.
Some quotes from the Reddit thread are below.
Looks like Redditors think almost always unsigned is bad
advice. Quote, my experience has been the opposite. Unsigned arithmetic tends to contain more bugs.
Code is written by humans and humans are really bad at reasoning in unsigned arithmetic.
End quote. But Daniela Engert repeats,
I like this article as it matches my experience from decades of software development.
Robert Ramey says,
The dispute is never ever going to be resolved,
but until it is, use Boost.SafeNumerics.
It's his library, by the way.
Useful.
Let's look at the core guidelines.
ES.106.
Quote.
Don't try to avoid negative values by using unsigned.
Choosing unsigned implies many changes to the usual behavior of integers,
including modular arithmetic,
can suppress warnings related to overflow,
and opens the door for errors related to signed-unsigned mixes.
Using unsigned doesn't actually eliminate the possibility of negative values.
Core guidelines ES.107.
Don't use unsigned for subscripts.
Prefer gsl::index.
See Microsoft's GSL.
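Both guidelines can be illustrated in a few lines. This is a sketch under assumptions: area is loosely modeled on the example in ES.106, and signed_index is a hypothetical alias that mirrors gsl::index (a std::ptrdiff_t) so that the snippet does not depend on the GSL being installed.

```cpp
#include <cstddef>
#include <vector>

// Loosely modeled on the area example in ES.106: unsigned parameters do not
// stop a negative argument from arriving, they reinterpret it as a huge value.
unsigned area(unsigned height, unsigned width) { return height * width; }

// Hypothetical alias mirroring gsl::index, which is a signed type
// (std::ptrdiff_t), so index arithmetic can go below zero without wrapping.
using signed_index = std::ptrdiff_t;

// Finds the last position holding `value`, or -1 if there is none.
signed_index last_index_of(const std::vector<int>& v, int value) {
    for (signed_index i = static_cast<signed_index>(v.size()) - 1; i >= 0; --i) {
        if (v[static_cast<std::size_t>(i)] == value) return i;
    }
    return -1;
}
```

Passing -2 converted to unsigned as the height gives a huge wrapped product, not an error. The signed index lets the loop terminate on i >= 0 and leaves -1 available as a "not found" result.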
See also a good presentation on this topic
from CppCon 2016.
Jon Kalb, unsigned: A Guideline for Better Code, where he
explains very well why using unsigned for arithmetic is a bad idea.
Little C++ Standard Library Utility: std::align.
Lesley Lai posted an article
about std::align in which he describes a use case, then implements a
helper function manually, and finally replaces it with std::align.
Quote,
Arena, also called bump allocator or region-based allocator, is probably the simplest allocation
strategy.
It is so widely used that even the C++ standard library has an arena implementation called
std::pmr::monotonic_buffer_resource. Talk about intuitive names there.
With arena, we first have a large chunk of pre-allocated memory. The chunk of memory itself
can either come from the stack or one large allocation of another allocator, such as malloc.
Afterward, we allocate memory from that chunk by bumping a pointer offset.
Arena allocation has outstanding performance characteristics,
especially compared to complicated beasts like malloc.
Each allocation only needs to bump a pointer,
and the deallocation is almost free if the objects allocated are
trivially destructible. It is useful in situations where we have a lot of heterogeneous allocations
that only need to be freed together, and is widely used in application domains from compilers
to video games." When allocating memory in an arena, we need to deal with alignment.
Incrementing the next pointer by the size of the allocated object is not enough,
because starting the lifetime of objects at unaligned locations is undefined behavior.
Lesley then explains how to handle alignment,
including a full implementation of a helper function that bumps the arena pointer
to the next allocation address and updates the
remaining arena space. Turns out that's what std::align is for. From cppreference, quote, given a pointer
to a buffer of size space returns a pointer aligned by the specified alignment for size
number of bytes and decreases space argument by the number of bytes used for alignment.
The first aligned address is returned.
End quote.
We still need to change
the pointer and space according to
the actual allocation size
as they are only bumped by the number of
alignment bytes.
So here it is.
std::align.
Very useful function for a very limited use case.
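As a rough illustration of the pattern described above, here is a minimal arena sketch built on std::align. The Arena class and the is_aligned helper are hypothetical names, not the code from the article. The sketch shows the two steps: std::align accounts for the padding, and the caller accounts for the allocation itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>  // std::align

// Minimal bump allocator over a caller-provided buffer (hypothetical class).
class Arena {
public:
    Arena(void* buffer, std::size_t size) : next_(buffer), space_(size) {}

    // Returns storage for `size` bytes at the requested alignment,
    // or nullptr when the remaining space is too small.
    void* allocate(std::size_t size, std::size_t alignment) {
        // On success std::align moves next_ up to the aligned address and
        // shrinks space_ by the padding only; on failure it changes nothing.
        if (std::align(alignment, size, next_, space_) == nullptr) {
            return nullptr;
        }
        void* result = next_;
        // Account for the allocation itself, which std::align does not do.
        next_ = static_cast<unsigned char*>(next_) + size;
        space_ -= size;
        return result;
    }

    std::size_t remaining() const { return space_; }

private:
    void* next_;
    std::size_t space_;
};

// Helper for checking results: true if p is a multiple of `alignment`.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

A typical use is to hand the arena a stack buffer such as alignas(16) unsigned char buffer[64], call allocate(sizeof(T), alignof(T)), and construct the object in the returned storage with placement new. Nothing is freed individually, which matches the "freed together" use case from the quote.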
T* makes for a poor optional<T&>.
Barry Revzin writes, quote,
Whenever the idea of an optional reference comes up, inevitably somebody will bring up
the point that we don't need to support optional<T&> because we already have in the language
a perfectly good
optional reference: T*. The purpose of this post is to point out that despite these similarities,
a pointer to T is simply not a solution for optional<T&>. End quote. As an example,
Barry chose a function that returns the first element of a range or nothing if the range is empty.
He develops the function over several iterations and demonstrates that in order to support any range the return type must be optional<T&>. He shows that T* doesn't actually work
and optional<T> is also unsuitable for some input range types, notably vector<T>, which has a reference
type of T&. When the return type is optional<ranges::range_reference_t<R>>, where R is the input
range type, it resolves to optional<int> for ranges whose reference type is int. But for vector<int>, it resolves
to optional<int&>. The author shows that
returning pointer to int instead doesn't work without a lot of workarounds. If you wanted to
have a nice usage where a default value is returned if the input range is empty, it doesn't
work with pointers unless more workarounds are provided, and then the call syntax is not as nice. Barry takes a quick aside to remind
us why vector<bool> is bad, and then writes, quote, if we all agree that vector<bool> is bad
because of several subtle differences with vector<T>, then surely we should all agree that T*
is a bad optional<T&> because it has several very large and completely unavoidable differences with
optional<T>. Namely, it is spelled differently from optional<T>, it is differently constructible
from optional<T>. You need to write &e in one case and e in the other. And it has a different
set of supported operations from optional<T>." Barry talks about the future too.
We don't have pattern matching in C++ yet, and we still won't in C++23. But eventually we will,
and when we do, we'll want to be able to match on whether our optional reference actually contains
a reference or not. We do not need to match whether we are holding a derived
type or not. This is yet another operation that a T* won't do for us. He concludes,
quote, the ability to have optional<T&> as a type, making optional a total metafunction,
means that in algorithms where you want to return an optional value of some computed type U,
you can just write optional<U> without having to worry about whether U happens to be a reference type or not. This makes such algorithms easy to write. End quote.
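The difference in call syntax between an optional and a pointer can be sketched in a few lines. This assumes C++17, whose std::optional does not accept reference types, so the optional flavor below returns a copy. All function names are hypothetical and the code is not taken from Barry's post.

```cpp
#include <optional>
#include <vector>

// Value-returning flavor: an empty range maps to std::nullopt.
std::optional<int> front_value(const std::vector<int>& v) {
    if (v.empty()) return std::nullopt;
    return v.front();
}

// Pointer flavor standing in for an "optional reference".
const int* front_pointer(const std::vector<int>& v) {
    return v.empty() ? nullptr : &v.front();
}

// With optional, the fallback is a single member call.
int front_or_via_optional(const std::vector<int>& v, int fallback) {
    return front_value(v).value_or(fallback);
}

// With a pointer there is no value_or, so the caller spells out the check.
int front_or_via_pointer(const std::vector<int>& v, int fallback) {
    const int* p = front_pointer(v);
    return p ? *p : fallback;
}
```

The two flavors also differ in how the result is built: the pointer version needs an explicit address-of, and the optional version takes the element directly. That is the "&e in one case and e in the other" point from the quote.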
The Reddit thread has much to say about the topic and specifically about what should happen
when assigning to optional<T&>.
Printing tabular data.
Some links for when you need to output tables in C++.
tableprinter by Ozan Cansel.
A header-only C++17 library, no third-party dependencies, MIT license.
CppConsoleTable by Denis Samilton.
A header-only C++17 library, MIT license.
This one prints table borders using the extended line drawing characters.
And variadic_table by Derek Gaston.
A header-only C++ library that uses variadic templates for convenience.
This one prints table borders using standard characters, dash and pipe.
The license is LGPL 2.1, so be careful.
Outcome enters sustaining phase, goes ABI stable.
Niall Douglas announced on Reddit that his Outcome library has entered sustaining phase and now has a stable ABI.
Outcome is an alternative error handling library that may be useful when exceptions are not allowed
or when you need deterministic performance for the sad path as opposed to the happy path.
Niall provides a comparison of Outcome with alternative error handling methods and libraries. Outcome comes as a standalone library and as part of Boost,
Boost.Outcome.
There is a clarification of the difference between these versions.
It has been well-reviewed and production-tested
and is available via C++ package managers such as Conan and vcpkg.
The library requires C++14 and comes as a single header under the Apache 2 license.
Include guards versus #pragma once. Again.
A new poll on Reddit shows that developers prefer the non-standard but widely supported
#pragma once over include guard macros. This thread
contains anecdotal references for and against each option. A game developer at Epic says,
quote, since people were copying files and renaming them without changing the guard name,
we have switched to #pragma once because all toolchains for targeted platforms have proper support for it. End quote. That sounds
like they have a bigger problem. Another Redditor reminds us what the core guidelines say about
it. Quote, personally, I tend to follow the C++ Core Guidelines and use #ifndef. SF.8,
use #include guards for all .h files. End quote.
Of course, people argue against that too.
Here is an amusing exchange.
Quote, #pragma once, it's 2022.
Reply, in 2022, we use import.
Reply, I wish.
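The copy-and-rename problem mentioned by the Epic developer can be reproduced in a single file. In this sketch the two guarded sections stand in for two hypothetical headers, widget.h and gadget.h, and all names are made up for illustration.

```cpp
// --- widget.h (hypothetical header) ---
#ifndef WIDGET_H
#define WIDGET_H
inline int widget_version() { return 1; }
#endif

// --- gadget.h, copied from widget.h without renaming the guard ---
#ifndef WIDGET_H  // bug: should have become GADGET_H
#define WIDGET_H
#define GADGET_H_CONTENTS_SEEN 1
inline int gadget_version() { return 2; }
#endif

// Reports whether the body of gadget.h survived preprocessing.
constexpr bool gadget_header_was_seen() {
#ifdef GADGET_H_CONTENTS_SEEN
    return true;
#else
    return false;
#endif
}
```

Because the guard macro is already defined when the second section is reached, gadget_version is silently dropped and gadget_header_was_seen() returns false. With #pragma once at the top of each real file, the copied header would still be included, since the mechanism is tied to the file and not to a macro name.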
That's it for now.
And I'll leave you with a couple of lighter snippets.
This is what I overheard. Life lessons. Life taught me two lessons. The first I don't remember.
And the second was write everything down.
Something from Reddit. Rust does generics a lot better than C++. And someone replies,
if a C++ developer is ever stranded in the desert, all he has to do is say to
the empty sands, C++ is pretty okay. A Rust enthusiast will appear immediately
to correct him.
Okay, that's it. Thanks for joining me today and I'll talk to you soon. Bye!