C++ Club - 165. Bjarne Stroustrup's talks on safety, WG21 August mailing, Rust

Episode Date: September 14, 2023

With Gianluca Delfino, Frances Buontempo, Vladimír Arnošt, Ivor Hewitt et al.
Notes: https://cppclub.uk/meetings/2023/165/
Video: https://youtu.be/WpXW42iYP_k...

Transcript
Starting point is 00:00:00 Welcome everyone, this is Meeting 165 and today is the 7th of September 2023. And it's 31 degrees outside in London. Right, the first topic is a video, life advice from Bjarne Stroustrup. This was on the Honeypot channel and it's a pretty short one, just three minutes and something. Bjarne said, don't overspecialize, to which someone on Reddit said, that's funny coming from a person who created C++. But I don't see it that way. C++ is exactly the don't overspecialize idea, because you can do pretty much anything with it, can't you? Another thing he said, have a life and interests outside computing, which yeah, the more outside interests you have, the more balanced personality you will
Starting point is 00:01:14 be. And that will help given that we have so many peculiar personalities in our technical realm. Also build a portfolio of skills and be ready for the opportunity and don't let it go unnoticed. So all in all, good sensible advice. And on Reddit I liked this comment. Life is like undefined behavior. No one knows what will happen next. Yeah, that's true. Or vice versa is also true. Next video is also by Bjarne. He presented it live in Core CPP Israel recently.
Starting point is 00:02:07 Well, was it recent? Four weeks ago. This seems to be the latest thing to go and defend C++ from all the articles and government organization notes that say don't use it. So it's a bit of a marketing effort, I'd say. It's a very interesting presentation. I'm not going to be able to summarize it in a few words, so just go and watch it. But in short, there were some points that I wanted to repeat. The goal, he said, was type and resource safe C++. And he's been working on it for many, many years. He said that the spooks, meaning NSA, say, don't use C slash C++, as it's unsafe. I sort of agree, said Bjarne. I don't like C slash C++ myself. It's a mythical language,
Starting point is 00:03:17 which is usually a mess. He also said, people choose C++ for a reason. The alternatives are new and shiny, but largely untried and don't handle the range of applications that C++ does. He had a slide on type and resource safety. The bullet points were, every object is accessed according to the type with which it was defined. Type safety. Every object is properly constructed and destroyed. Resource safety. Every pointer either points to a valid object or is the null pointer. Memory safety.
Starting point is 00:04:00 Every reference through a pointer is not through the null pointer. Often a runtime check. And every access through a subscripted pointer is in range, often a runtime check. That is just what C++ requires and what most programmers have tried to ensure since the dawn of time. And the last bullet was the enforcement rules are more deduced than invented. He said you just have to be careful, but being careful is not good enough. And also regarding other languages, which often outsource unsafe parts to other components or libraries written in C++, or even C. He said, we can't just outsource unsafe parts of C++ to another language. He mentioned RAII and said, resource acquisition is initialization must be the dumbest
Starting point is 00:04:55 name for a great feature. Apologies for that, I was busy. End quote. He said, people come and say, C++ is too complex. I want you to simplify it. By the way, while you're at it, I want these two features added, I need them yesterday. And whatever you do, don't break my code. You can't have all three of those. He also said that safety by subsetting the language doesn't work because lots of the low-level features aren't safe. So he proposed something of a subset of superset, which is you need to extend language with a few new abstractions by using libraries, like use the standard library, and add a small library, meaning the GSL, Guideline Support
Starting point is 00:05:49 Library from Microsoft, which helps with messy, dangerous, low-level features, wrapping them up in a safe way. Quote, what we want is C++ on steroids. Simple, safe, flexible and fast. Not a neutered subset. And no change of meaning. The resulting code is ISO C++. End quote. He said that different domains have different definitions of safety, but basic type and resource safety should be common. Arbitrary C++ code is simply too complex for static analysis.
Starting point is 00:06:39 And so he proposes to introduce profiles to help with that. A profile is a coherent set of rules yielding a guarantee. It must be visible in code to indicate intent and to trigger analysis. And it so happens that it's similar to Ada's safety profiles. There is a whole page in Ada documentation on profiles. And the profiles in C++ were created or proposed independently of Ada profiles. Someone else pointed Bjarne to that. The hardest problem, he said, was mixing profiles. This is a work in progress, still under development, and there are many ways of doing it.
Starting point is 00:07:27 And it's a difficult problem, because parts compiled under one profile would expect certain safety guarantees from code that might have been compiled under another profile with different guarantees. The controls for that could be module-based, you could enforce some guarantees like memory safety for a particular module, or you can apply a special attribute related to profiles to, say, an import statement, which would mean that if you import std with that attribute, memory safety is enforced for all users of std in your
Starting point is 00:08:15 code. Or alternatively, you can suppress certain checks. Like for example, you can import a module with an attribute that suppresses type safety in it. You could apply the same attributes, suppress type safety or enforce type safety, as an example, to a particular piece of code or a variable or a function or whatnot. And that would mean that the static analyzer would suppress checks for that particular profile or enforce them for that particular profile, but only when applied to that particular piece of code. The initial profiles suggested were type safety, which means no type or resource violations, range, which means no pointer arithmetic, no null pointer dereference, span and vector range throw or terminate on violations,
Starting point is 00:09:12 and arithmetic, no overflow, no narrowing conversions, no implicit signed-unsigned conversions. He expanded on these topics in detail, which is why I suggest you go and watch that presentation. This is like a new iteration of the same presentation that we mentioned before, and I failed to provide a link to it. So that's the new one. Go and watch this one. It's updated. Interestingly, during the video, you had automatic transcription enabled.
Starting point is 00:09:56 So good use of machine learning, I suppose. It's still not AI. But yeah, some of the transcriptions were funny. But in general, it worked pretty well. So there was another talk. It was an interview. The channel Software Daily, I think it's UK-based, interviewed Bjarne. That was five months ago. And so many topics were similar to those discussed in that previous presentation.
Starting point is 00:10:33 It's pretty much a Q&A session where the host raises the same controversial points, like the NSA memo, and how easy it is to learn C++, and what about other supposedly successful languages, and so on. So this is like a more extended subset of the previous presentation. Still interesting to watch. It's a quite relaxed one, so it was good. At one point Bjarne said about the NSA memos and the like, and it's an approximate quote, he said, "Well, they say don't use C slash C++ but try the solid, which is not even standardized and has no
Starting point is 00:11:25 formal definition, but it's safe. Give me a break." You could see that he was a bit annoyed. On to the next topic. The new committee mailing for August dropped, and there were quite a few papers and some even at revision zero which is always interesting. So I thought I'd go through some of them. There's a reddit thread which starts with insert complaint about reflection. They probably mean that there wasn't anything about either reflection or pattern matching in this set of papers.
Starting point is 00:12:10 Nothing. Oh well. I'm still optimistic. I think we should still complain. Yes, I think we should encourage those complaints. Like maintain the pressure. Yeah, but there's still time. I'm hopeful.
Starting point is 00:12:29 Right. So the papers I had in mind were these. Erroneous Behaviour for Uninitialized Reads by Thomas Köppe. We talked about erroneous behavior previously, and it wasn't clear to me what it was or how it would be implemented in order to prevent UB. So this is a kind of explanation which made it a bit clearer to me. What it says is, quote, we propose to address the safety problems of reading a default initialized automatic variable, an uninitialized read, by adding a novel kind of behavior for C++. This new behavior, called erroneous behavior,
Starting point is 00:13:23 allows us to formally speak about buggy or incorrect code, that is, code that does not mean what it should mean, in the sense we will discuss. This behavior is both wrong in the sense of indicating a programming bug and also well defined in the sense of not posing a safety risk. This erroneous behavior relies, at least in this paper, relies on a previous proposal, if you remember there was a proposal about default initialized local variables. And this builds on top of it. The quote continues we propose to change the semantics of reading an uninitialized variable. Default initialization of an automatic variable initializes the variable with a fixed value defined by the
Starting point is 00:14:14 implementation. However, reading that value is a conceptual error. Implementations are allowed and encouraged to diagnose this error, but they're also allowed to ignore this error and treat the read as valid. Additionally, an opt-out mechanism in the form of an attribute on a variable definition is provided to restore the previous behavior. So it's not UB because we have default initialization, but you still can't do it. Well, you can, but it's an error because you are not initializing the variable yourself explicitly. But this allows... I am confused.
Starting point is 00:14:59 Right. Yeah. This is pretty confusing. So in order to avoid UB, the variable has to be initialized. So this is done implicitly by the compiler. But at the same time, the compiler may or may not report it as an error. So I'm worried about users in the case where the compiler didn't report it as an error, which it is an error nowadays that the compiler can report on anyway. But if we did introduce this and still the compiler did not report it as an error, then maybe some users would start relying on the fact that the compiler initializes it and doesn't report it. Assuming that things are initialized to zero, and then maybe the code gets ported to another compiler that does not do that, and there you go. You then have something that explodes. True. Users can be pretty ingenious with rules violations, we all know that.
Starting point is 00:16:08 So at least this won't be UB, and there will be a way to detect it, and presumably this will always be a warning, or will be highlighted by the static analyzer. Probably a compiler warning, I would say. So yeah, it's weird, isn't it? It's initialized, but you can't use that value because it's an error. And you also shouldn't assume anything. So I mean, at this point, it's not very useful. And if it was useful, I think the users would maybe rely on it when they shouldn't. So I don't know, I'm not sold. Like we said, it's a default initialization, and using that is an error, but not UB. And also, if you don't want it, you can use an attribute to disable that, if you say you don't want a default initialization because it's expensive for something. Do you think this is also a reaction to the NSA thing?
Starting point is 00:17:35 Let's add this thing so we can say that we do not have at least this problem? Oh, I think about a third of the papers is a reaction to that NSA memo, including this one probably. But yeah, it's kind of... even putting aside the NSA memo, it's an attempt to solve the problem. I mean, UB is bad. Anything to address it is probably beneficial. I don't know. Yeah, but I also wouldn't want the users to think, okay, maybe I can rely on the fact
Starting point is 00:18:19 that the compiler will somewhat initialize it to something. Although then I cannot still use it. Maybe then, you know, if the compiler chooses to initialize it to something and then not report the error of me using it, then still people will just think, okay, fine, that's a behavior. I'm just going to say that that's fine. Actually, that's a good point. How does this work with the paper that it is sort of based on? If, say, we have default local variable initialization, that assumes that the compiler will default initialize variables to zero or whatever their default value is. And so users will start relying on that. But this is sort of an opposite. It is. I think those are mutually exclusive, either this or that, and the other one is more extreme than this and more difficult to implement. I think Timur Doumler had good arguments for and against initializing stack variables.
Starting point is 00:19:30 And I wish I remembered which talk it was, one of the recent ones. But yeah, I mean, it's a can of worms, you know, and even with the good intentions to initialize things, standardized initialization of stack variables, which sounds like a no-brainer, there are catches and things that would break, obviously.
Starting point is 00:19:57 So I don't see that happening, unfortunately. It's interesting. Yeah, these papers are mutually exclusive, like you said. So which one gets accepted? Neither, probably. If the default initialization paper gets accepted, then there's no erroneous behavior. Yeah. Quote from the paper.
Starting point is 00:20:23 In other words, it's still wrong to read an uninitialized value, but if you do read it and the implementation does not otherwise stop you, you get some specific value. End quote. Yeah. Well, we'll see how this goes. This sounds, you know what, like an object that has been moved from. It's in an unspecified but valid state. Yeah, yeah, yeah, exactly. Right. The next paper was span.at by Jarrad Waterloo. This proposes to add the function at to std::span and says, quote, this new method is safe in the sense that it has defined behavior instead of undefined behavior. Further, the defined behavior is one that can be caught in the code by catching
Starting point is 00:21:26 the exception. And just for consistency's sake, all other containers have at alongside the subscript operator, but span doesn't. So this could be like a consistency fix. Yeah, why not? Sounds good. Yeah. Method. Member function. C++. Right. Next one is related to contracts. The title is An Attribute-like Syntax for Contracts by Joshua Berne. And this is one of the syntax-oriented proposals, because the minimum viable contract proposal sort of sidestepped the issue of syntax and assumed that it will be decided afterwards, so they avoided going with a specific syntax.
Starting point is 00:22:33 But this seems to be one of the papers that address this, and it proposes an attribute-like syntax, which we've seen before. It's probably just a reiteration, although this particular paper is at revision 0. This is what the code looks like. You have a function signature and conditions, preconditions and postconditions as well, in the same block. And the same syntax is used for assertions within the function. And each condition looks like an attribute, and the attribute can be pre, post, or assert. Then you have a colon followed by the actual condition, which presumably is a typo, it should probably be
Starting point is 00:23:50 f, equals the result of the function call equals x, and then some other postconditions, r is probably the return value or something. Anyway, it looks like a bunch of attributes. And you can also apply them to lambdas and such. So yeah. Let the bike shedding commence! Still, contracts progress, which is good. Next paper is by the same person as span.at, Jarrad Waterloo, Safer Range Access. I only briefly skimmed it, but I noticed that there are lots of attributes. For example, an unsafe attribute, which lists the reasons for, say, a function declaration being unsafe. That could be range, dangling reference, reference invalidation, and so on. So this reminds me of an earlier paper by the same person, I think, which proposed sort of safety-oriented markup for your code.
Starting point is 00:25:15 Again, with attributes which made the code like a wall of attributes. It's extremely noisy and it's pretty much... I don't know, I don't like it. And there's another paper called Reference Checking, again by the same person. He seems to like attributes, more attributes. This one introduces attributes to check references returned from functions that, say, use reference parameters. And he says, quote, this paper proposes that we allow programmers to provide explicit lifetime dependence information to the compiler for the following reasons. Standardize the documentation of lifetimes of APIs for developers, standardize the specification of lifetimes
Starting point is 00:26:12 for proposals, and greatly reduce the dangling of the stack for references. What is being asked for is similar to, but not exactly like, Rust's feature called explicit lifetimes." And the way he proposes to do it is more attributes. I don't know. I'm not convinced. Well, it is a tricky business. I mean, I commend this for sure. This is one of those things that people go to Rust for. At the same time, it is kind of ugly. So it's difficult for sure.
Starting point is 00:26:55 Do you know if in Rust you also have to write lots of annotations? Not always. Not always. There is a subset of cases the compiler can deduce it for you. But I think that only applies for the simple cases, which may be 90% of the code. I'm not sure. I'm not a Rust developer. But you end up with lots of places where you have syntax where you get these apostrophes and these lifetimes A and Bs, like you see at the very top of your screen. That's kind of a common thing to see anyway. I wouldn't want to write these kinds of attributes for all the functions.
Starting point is 00:27:43 No, me neither. But if it can be achieved in any way, it probably won't be pretty. So I don't know. I wouldn't know, honestly. What if you specify that a particular parameter depends on something and then the code gets updated and your attribute gets out of date? Like a sort of a comment? Yeah, I don't know. Again, I think we'd have to copy from what Rust does. I would prefer magic, to be honest. I would prefer a compiler magically knew what was going on and didn't need to write any annotations. That would be great. But I think the problem is across translation units and then there's really not much you can rely on the compiler for. I think cross translation units,
Starting point is 00:28:40 I think that's where Rust also requires you to be more explicit. But again, I think an expert in Rust would have a better answer there. We should find someone for this podcast, a Rust expert who would comment on anything we say and says, Rust has it fixed. What nonsense is this? We don't want to put off any potential guests. We welcome you, Rust, available. Right. Okay, so speaking of Rust, we go smoothly to the next topic. Someone posted on Reddit,
Starting point is 00:29:24 considering C++ over Rust. And the quote is, To give a brief intro, I have worked with both Rust and C++. Rust mainly for web servers plus CLI tools and C++ for game development. Recently one of my friends who is a JavaScript dev said to me in a conversation, why are you using C++? It's bad, and Rust fixes all the issues C++ has. That's one of the major slogans Rust community has been using. And yeah, the creator of Node apparently says, I won't ever start a new C++ project again in my life. On the other hand, I've been working with C++
Starting point is 00:30:06 for years heavily with Unreal Engine and I've never in my life faced an issue that usually the Rust community lists. And the poster wanted to ask the people at the CPP subreddit, what's your take on that? Did you try Rust? What's the reason you still prefer using C++ over Rust? And to be honest, that discussion... I didn't hate it. It was pretty good as far as the CPP subreddit is concerned. The first commenter says, for a long time Java was going to kill C++. And I remember a period when Go was going to kill C++. Beside these big ones, there have been plenty of other languages that were popular enough. I got questions why projects were in C++ rather than them, given they were a hot thing, including stuff like D or Scala
Starting point is 00:31:01 that are largely forgotten now, but for a while had a lot of mindshare. Maybe Rust will actually do it, maybe Zig will. So yeah, the quote continues, an issue of new languages like Rust is their users are all programmers who decided to use the latest coolest language. That means if something new comes out, their users are the sort of people who will jump ship to that. Folks still programming in C++ have chosen not to jump ship many times before, so I'm pretty sure the language will be at least fairly popular for a long time. Sorry, Java cannot kill C++ because you still need a language to implement the Java Virtual Machine in, which is something we did
Starting point is 00:31:42 with the garbage collector, and it actually gets compiled to binary. Yeah, there's lots of supposedly successful languages that are using, for example, the LLVM project, which is written in C++, to compile that. I think Rust is self-hosted now, and I think Zig is also self-hosted now. But they all start with C++. There was also a very long comment from someone who used both and listed all the pros and cons. I think that's a repeat of a comment we've seen before.
Starting point is 00:32:24 It's because the points were pretty similar to what I remember from that comment. Oh, there you go. This is from James20k. He says, I've been using C++ for too long, and these are the things that I think Rust solves. And then lists all the points. But another paragraph says, downsides to Rust.
Starting point is 00:32:46 And then he again lists quite a few points. So there's pros and cons. Lots of good, insightful replies. Good discussion. I liked it. One Redditor quoted James Mickens. I encourage you to look him up. He's a treasure to the programming community.
Starting point is 00:33:11 And he wrote many things that I consider the best posts ever. And so one of his quotes was, he meant Lisp, but someone in the thread applied it to Rust, in the sense that Rust used to also require C++ to compile basically. And the quote goes like this. "You can't just place a Lisp, strike that, Rust book on top of an x86 chip and hope that the hardware learns about lambda calculus by osmosis." End quote. And another quote I had from this thread was again by James20k. I found it irritating that some of the same C devs who have been in denial about C++ for decades now rave about Rust. And someone mentioned Linus Torvalds in that context, who now allows Rust in the Linux kernel, but still hates C++, as far as I know. And there were comments like this from Dean Roddey, who migrated from C++ to Rust quite happily.
Starting point is 00:34:33 Again, he lists areas and points where Rust has clear advantages over C++. So yeah, if this works for you, go ahead. I wanted to show you this excellent YouTube channel. It's called Dave's Garage. This is by a guy called Dave Plummer. He used to be a Microsoft developer in the 90s and this channel is lots of tech stuff, including really interesting memories from back then. Like he was the one who created Task Manager
Starting point is 00:35:17 and wrote some like Windows protection code and such. So his anecdotes from the old days at Microsoft are very interesting, to me at least. I love old school computing and retrocomputing. Shows my age probably, but hey. I can confirm. This is the two of us showing their age. Dave's Garage is a very good channel. So, yeah, one of the topics he covers occasionally is C++, and he calls these stupid C++ tricks with Dave. And funnily enough, he uses modern C++, not the 90s stuff.
Starting point is 00:36:05 So that's good. He's a great presenter, and he goes through code. And yeah, sometimes he goes and reviews the old code, and sometimes he just talks about how to use modern C++. So yeah, I think it's a good pastime. I like him. Like, for example, one of the anecdotes. At Microsoft, they never called the blue screen of death a blue screen of death.
Starting point is 00:36:34 Only blue screen or a bug check, as it was initiated by calling the kernel function KeBugCheck. Right. Next topic, a quick one: Node.js is moving to a new, faster URL parser called Ada, written in modern C++. Putting aside the name, which means the author either didn't know about the Ada language or just didn't care. Someone says, written in modern C++, and literally the first lines in Ada.h are macro definitions. And this didn't work and was not a power to mode checking.
Starting point is 00:37:22 Yeah, and not to mention memcpy function calls, apparently. Someone says, on modern Clang, even at a low optimization level, std::copy is optimized to memcpy anyway. And still we use that. Modern C++, right. Actually, I've seen std::copy being optimized into memmove, which is slower than memcpy. So there are performance reasons to call memcpy manually. Okay, I stand corrected. Interesting. Another document that someone pointed me to was Chromium OS Documentation Development Basics. And they have a section on programming languages
Starting point is 00:38:06 and style. And that includes Rust, C++, C, Shell, and Python. Note the ordering. Rust is first, C++ is next, and then C and Shell and Python. I'm sure that's not significant in any way. But then they have this paper which is called Modern C++ Use in Chromium. And they say this document is part of the more general Chromium C++ style guide. The interesting thing is that they list what C++ standards are supported and what features from each standard are banned. C++ 11 is default allowed except banned features. C++ 14 is default allowed. Nothing is banned.
Starting point is 00:39:10 17 is supported from December 2021. 20 is not yet supported, with the exception of designated initializers. 23, obviously not, not yet standardized. But then you go to the contents and see what features are banned. And it's like a lot. A lot of standard libraries are banned. And they provide a justification for each banned feature, for why each feature was banned.
Starting point is 00:39:50 Like inline namespaces, unclear how it will work with components. The long long type, use stdint.h, not even cstdint. And then you have tons of features. I like this one. Exceptions. Banned. Notes. Exceptions are banned by the Google Style Guide and disabled in Chromium compiles. Except the noexcept specifier. So,
Starting point is 00:40:17 we've long known that Google doesn't use exceptions. I think that precludes them from calling it modern C++. It's just me. So yeah, if you have nothing to do and want to see what features of C++ are banned at Google, then yeah, go and read that. I might
Starting point is 00:40:47 happen to know why they banned the random generator. I guess I'm going to have to go and do homework. That was bizarre. Just go on just down a bit. Engines and generators from random
Starting point is 00:41:03 are banned. Maybe they don't trust it from a security point of view. Maybe they would impose random generation from OpenSSL libraries and the like. No, they say, instead use base::RandomBitGenerator. They have their own function, apparently, in Chromium. But I suspect the reason for banning could be how hard it is to use random properly. At the moment you have like a long initialization sequence that you have to follow in order to initialize the engines properly, and that is the subject of a proposal currently in flight. Yeah, I've seen that proposal. Yeah, it's going to be interesting to see where,
Starting point is 00:41:52 if we do end up with some changes. I think for most cases, just seeding it with one number is good enough. It depends what you're trying to do. Obviously, no good for cryptographically secure things and so on. I should do some reading. That looks interesting. Yep. So yeah, coming closer to the end,
Starting point is 00:42:16 I wanted to show you this article, Why Rust Will Replace C++ in the Future. Roy is a chief architect and co-founder at BLST Security, proficient in AWS, Rust, and Python. And I started reading it and I had this vague feeling. Yeah, you're right. It just looks auto-generated. You just read sentence after sentence. Exactly. It's like that's what a ChatGPT article looks like.
Starting point is 00:42:54 And it goes on and on. It's quite a long one. The benefits of using Rust. Why Rust will replace C++. In the past, C++ has been the go-to language for systems programming. Overall, Rust has many advantages over... Wow! You can try. You can paste those bold headings into ChatGPT and see what output you get. Maybe
Starting point is 00:43:18 these are the prompts in bold and the paragraph is... I mean, yeah, life's too short to prove or dispel the idea, but for me it definitely looks auto-generated. And the irony is that the article was probably generated by code written in C++ in the end. Regarding C slash C++, Predrag Gruevski writes on Mastodon, including C slash C++ on my resume has been a great way of selecting out people that would be the kind of "well, actually" that I don't think I'd enjoy working with. Yeah, that's a take. And this one from Mastodon by akin yan: if pragma once is so good, where is pragma twice? And finally, from Dare Obasanjo, who posted this picture.
Starting point is 00:44:36 It's a picture of a small clock taped onto a bigger clock. Quote, a broken clock fixed by taping a working clock over it is a metaphor for every codebase you'll encounter in your professional career as a software developer.
