CppCast - Catch2 and std::random

Episode Date: May 21, 2020

Rob and Jason are joined by Martin Hořeňovský. They first discuss some ISO papers and Jason learning Rust from his cousin Jonathan. Then Martin tells them about his work maintaining Catch2, including his plans for future updates of the unit testing library. Martin also talks about SAT solvers and problems with std::random.

News
2020-05 Standards mailing
Jonathan Teaches Jason Rust!
C++ Events affected by Coronavirus

Links
Catch2
CppCon 2019: Martin Hořeňovský "Solve Hard Problems Quickly Using SAT Solvers"
P2058 Make std::random_device Less Inscrutable
P2059 Make Pseudo-random Numbers Portable
P2060 Make Random Number Engines Seedable

Sponsors
PVS-Studio. Write #cppcast in the message field on the download page and get a one-month license
Read the article "Checking the GCC 10 Compiler with PVS-Studio" covering 10 heroically found errors despite the great number of macros in the GCC code.
Use code JetBrainsForCppCast during checkout at JetBrains.com for a 25% discount

Transcript
Starting point is 00:00:00 ... tools like IntelliJ, PyCharm, and ReSharper. To help you become a C++ guru, they've got CLion, an intelligent IDE, and ReSharper C++, a smart extension for Visual Studio. Exclusively for CppCast, JetBrains is offering a 25% discount on yearly individual licenses on both of these C++ tools, which applies to new purchases and renewals alike. Use the coupon code JetBrainsForCppCast during checkout at JetBrains.com to take advantage of this deal. In this episode, we discuss some ISO papers and Jason learning Rust. Then we talk to Martin Hořeňovský. Martin talks to us about the maintenance of Catch2. Welcome to episode 248 of CppCast, the first podcast for C++ developers by C++ developers.
Starting point is 00:01:46 I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how's it going today? I'm all right, Rob. How are you doing? I'm doing okay. I don't really have much news to share. You know, just another week under these current somewhat stressful conditions. Yeah. So I am definitely not teaching now at the conference, NDC TechTown in Norway. Okay. Because that would have been remote, and
Starting point is 00:02:19 decided to not go with that. And, well, I mean, we'll see what happens. Yeah, my class in Stuttgart is still on as far as we know. Okay. Is there a date when you would have to make a decision on whether or not that might be canceled? As far as I know, we're going to wait as long as possible and see how things are going in Germany, because that's the end of September, so that still gives us a pretty good window. That's another four months, right? Yeah. So there's still plenty of time to decide on that one. Okay.
Starting point is 00:02:52 Well, at the top of every episode, I like to read a piece of feedback. We got this comment on Reddit on last week's episode, which was about physical units. And this user, Phil, said: personally, I don't think adding physical units to the standard is a good idea. Maintain and use it as a third-party library, but don't force every compiler to ship it. It's the same reason we shouldn't add things like a
Starting point is 00:03:14 currency class to the standard. I don't know about you, Jason, but I'm biased. I like the idea of having physical units in the standard because I would probably make use of it. But what do you think? I certainly think it would be more useful than a currency class and less likely to change frequently. Yeah, I agree. Physical units are fairly constant. I think the argument that was made last week, that something that's in the standard is like 10 times more likely to get used than something that's not, is important to consider. And something like units is important for correctness. We have lots and lots of code that's not correct that tosses units around right now.
Starting point is 00:03:53 Yeah. And I don't remember if he mentioned it in last week's episode, but there were cases where we've used the wrong units in some library code before and, you know, space missions, like NASA missions, went badly because of it. Things don't go right. Yeah, it's bad. Very bad. Okay, well, we'd love to hear your thoughts about the show. You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com. And don't forget to leave us a review on iTunes or subscribe on YouTube. Joining us today is Martin Hořeňovský. Martin is currently a researcher at Locksley,
Starting point is 00:04:31 where he works on converting large master key systems to SAT. He has taught modern C++ at Czech Technical University in Prague and maintains Catch2, a popular C++ unit testing framework, in the little free time he has left. Martin, welcome to the show. Hi, thank you for inviting me. I'm looking at your bio, and I'm like, master key systems to SAT,
Starting point is 00:04:52 and I have absolutely no idea what that means, but I guess that's part of what we're going to talk about with you today. Oh, yeah. I'll explain both. Okay. Okay, sounds good. Well, Martin, we got a couple news articles to discuss. Feel free to comment on any of these, and we'll start talking more about your work on Catch2 and maybe some more about SAT stuff, okay?
Starting point is 00:05:11 Okay. All right. So this first one we have is a mailing from the C++ ISO committee, and I looked through a couple of these. I want to know if either of you have anything you wanted to highlight. One thing I thought was good was a paper on secure networking in C++ because it would be great if we finally get
Starting point is 00:05:35 networking soon, but it would be a shame if it was only HTTP. So it's nice that they're starting to look at adding secure networking as part of it. And then there's also a paper on evolving C++ remotely, which kind of just lays out how they're going about standards meetings right now. Yeah. I like some of the little ones, like the nice placeholder with no name by Corentin, because I write in other languages which use underscore
Starting point is 00:06:05 as "ignore this", so I would love to have it in C++ as well. Another paper references that one. I don't remember which one it was, I looked at the other paper first. Universal template parameters, perhaps. And it looks like they're
Starting point is 00:06:20 suggesting double underscore as the placeholder. Yeah, it's the C++ solution, right? You can't use single underscore because people already use it and you don't want to change anything, so you use double underscore. You know, it's more ugly. Okay. We might have to be careful there.
Starting point is 00:06:40 There's probably double underscore identifiers in people's code. We'll have to use triple underscore, and then it'll all be better. It's UB, that's not our problem. That's UB. I actually have one code base I'm involved in right now that uses single underscore for placeholders, where, reading the code, I actually think that for the particular use cases they have, just an empty set of braces to say "default initialize this thing" would actually be better in most of those cases, particularly since it's a hack of a thing that we have anyhow. Yeah. Yeah. And one caught my eye here, and this is a conversation I've had with Ben Deane at one point: attributes on lambda expressions, which has Daveed, Inbal, and Ville involved in it.
Starting point is 00:07:28 And it's interesting: if you look at the grammar for lambda expressions, you're allowed to put attributes in one place, and that one place doesn't do what you think it would do. It doesn't apply to the return type or to the type of the lambda. It's just like, why is this even here? Multidimensional subscript operator. Oh, yeah, we deprecated it for C++20, so the proposal is, for C++23, to actually enable it so you can have multiple arguments. Are you familiar with that one, Rob, that they deprecated the use of the comma operator inside the subscript operator? No, I don't think I am familiar with that one. Why was it taken out in 20?
Starting point is 00:08:08 Because it was always just a hack and never did what you intended it to do. That's not true. There was actually, like, one use of it in, I think, tests of some Boost library. Right. But, like, yeah, nobody used it anyway. I actually just recorded an episode for C++ Weekly
Starting point is 00:08:26 on that, just to demonstrate what it was and why it's gone and what it opens the door to now. So that's fun. It's a relatively small mailing, but it's still a fair bit of things to look at here. Yeah, it's definitely good to see that they're able to still make progress on all of these with the meeting that was supposed to be next month canceled. All right. Jason, what's this next one? You were going over some Rust with your cousin. Yeah. I just thought that maybe our listeners would be interested in it because we got pretty good feedback from the episode with Jonathan. So he and I did a three-hour live stream on my YouTube channel
Starting point is 00:09:10 of just him teaching me Rust. We started from the beginning, set up all the tools, installed the Rust compiler and all that stuff. And it's interesting
Starting point is 00:09:18 to me. I guess it makes me a little sad, in a way. There was not very much response on the C++ Reddit for it.
Starting point is 00:09:29 There were about five comments, but four of them were negative. Hmm. And on the Rust Reddit, where we posted it, we got much more positive responses. So it was like, why? You know, I don't know. It just made me a little
Starting point is 00:09:47 sad that the C++ Reddit wasn't nearly as welcoming as the Rust Reddit was for this. I'm curious what kind of negative comments you got, but if you don't want to go into it, that's fine. Yeah, I don't see any reason to go into specifics. Anyone could go and find the C++ Reddit post on it. They were just kind of rude, and the Rust ones were like, oh, it's great to see these things from a C++ perspective or whatever. Yeah. Oh, well.
Starting point is 00:10:15 There's no harm in exploring other languages. You're not going to leave us and go start RustCast and Rust Weekly, right? No. Okay. Okay, and then the last thing we have, which I think we've put in the show notes before, is this list of C++ events that have been postponed, delayed, or canceled entirely due to the coronavirus. It's being updated as new announcements are made, so we just thought it would be worth putting in here again. It does also have a list of some of the user groups
Starting point is 00:10:53 that have gone virtual. Do you know if there's any specific events here that we should mention, Jason, that we maybe haven't mentioned before? Looks like CppCon and Meeting C++ are still on in September and November, as of now. As of now, and code::dive. And code::dive.
Starting point is 00:11:13 What's happening with NDC TechTown? It's also September. TechTown has moved to online according to their website, but that doesn't seem to be on this list, does it? That's a shame. Yeah, it's not. That's why I'm asking. Yeah, TechTown has definitely moved online.
Starting point is 00:11:32 Their website has been updated. It's now a four-day online event. Yeah. Okay. So we'll keep everyone up to date on this if CppCon, Meeting C++, or code::dive changes. code::dive is a little bit smaller conference, but it's a fun one. I like going to Poland. There's also Microsoft. You know, I think we mentioned that they did their Pure Virtual C++ virtual conference a few weeks ago,
Starting point is 00:12:01 and this week they're actually also hosting the Microsoft Build conference completely virtually. So I don't think there's too much C++ content in there, but I'm sure it'll all be available on YouTube or some other streaming platform once it's all over. So that's another thing that some listeners might want to go watch. Okay, so Martin,
Starting point is 00:12:20 we've obviously had Phil Nash on a couple times and talked about Catch. When did you take over maintenance of Catch2? If you mean take over, as in I was the sole maintainer, it's two years ago, I think. Okay. And I started three and a half years ago, give or take a little bit.
Starting point is 00:12:41 Okay. And how did that process come about? How did you take the reins away from Phil? Did you wrestle him for it? Was it a battle to the death? No. No. We had too many beers, and the one who got drunk first lost.
Starting point is 00:12:58 Okay. Actually, it started with a CppCast episode with Phil on, where he was talking about, I think, his plans for Catch2, by which I mean at the time the next major version. And then I complained in the comments that it's sad that Catch isn't really maintained anymore, because I had fixed a couple of bugs for my own use, and the merge requests had just been sitting there for the last half year at the time. So Phil offered to give me the commit rights to help him with maintenance,
Starting point is 00:13:36 and then after about a year he stopped being active in the maintenance, so I took over. So how's it been going? Mostly fine. I actually did a retrospective at the start of the year, and I think I averaged out something like an issue a day. Wow. So that's fun. Basically, back when I opened my own issue against Catch, the number was 800-something.
Starting point is 00:14:06 Today, it's 1,900. That's 1,900 issues that have been resolved? Yeah, and merge requests and stuff. Mostly issues. Okay. So not 1,900 that are still open. No, 200 are open. Yeah, it's really difficult.
Starting point is 00:14:21 It used to be 300 before I started. Wow. Yeah, that's a lot. So what's it like? Is this your first experience being a maintainer of a popular open source library? Yeah. Sorry? No, go ahead.
Starting point is 00:14:36 It was pretty much my first experience with anything of the sort. Okay. So what's it like? You just mentioned there's several hundred open issues still, and I know from having my own open source tools that it's difficult to keep the issues down. You know, this learning process of interacting with your users and trying to decide whose pull requests to accept and that kind of thing.
Starting point is 00:15:09 I don't know, do you have any wisdom that you've learned over the last couple of years regarding maintaining an open source project? Yeah, you have to realize that it's your free time that you are investing into it, and you don't owe it to people to solve their issue right away. Especially when it's like, I have a two-year-old version of Catch on this computer, I can't upgrade, and my compiler is 10 years old, and there is a bug.
Starting point is 00:15:32 Okay. Yeah. There is a bug. You're like, sorry, but I can't help. Yeah. It's like, yes, you have GCC 4.4 on some obscure platform I have never heard of. You are on your own.
Starting point is 00:15:49 Help yourself. So just out of curiosity, do you ever, does anyone ever offer to, like, pay you to fix an issue or something like that? No? No. My company relies on this. We're using a 10-year-old compiler on an obscure platform.
Starting point is 00:16:06 Can you help us? If you pay me 100 euro. No. Okay, never mind. Doesn't GitHub have some sort of built-in, like, support your open source developers, Patreon-like thing now? There is, yeah.
Starting point is 00:16:20 Quite recent, like, I think this year. It actually is, like, two weeks since they opened it in the Czech Republic. Oh, okay. It's country by country, so if you weren't in the countries that were supported, you couldn't receive money. You could pay money, but you couldn't receive money. So we already mentioned that you took it over from Phil, but besides the fact that there's fewer open issues now and that you're closing issues,
Starting point is 00:16:46 is your style any different from Phil's as far as running the project? Yes. You said we can cut parts, right? If you don't want to talk about it, that's fine. No, it's more like my basic rule for dealing with issues and feature requests is that I don't promise people things that I don't expect to be able to fulfill soon.
Starting point is 00:17:08 Right? Okay. If someone says, okay, I want this feature, and it's a feature I want as well, I will say, okay, that's something I would like to have. And unless I know that I will get to the feature soon, I will not say, like, this is going to happen next version or anything. Right. Whereas Phil liked to go, well, we will offer support for property testing, when Catch didn't even have generators, right? So then after another year, the generators branch materialized, it was broken, then I fixed it, and so Catch has generators, but that's not property testing.
Starting point is 00:17:46 It's just generating inputs, but you don't have the shrinking and verification. So you try to manage the user's expectations. Yeah. That's important. The reality is I don't have that much time for Catch, right? So I don't promise things that I don't expect to be able to do. Okay, makes sense. I think a new version of Catch2 was released recently. What were
Starting point is 00:18:12 some of the features added there? Right, a bunch of versions actually, but there was the one version for the v2 branch, which is what people use right now, the single header one, and that had some technically interesting changes. Basically, now if you shuffle the tests, you say, okay, I want to run all the tests in random order, and then you use the same seed, but say, I only want these five tests, then they will be in the same order, right? They will be randomly shuffled, but they will have the same relative order as they had when you had all the tests.
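A minimal sketch of one way to get the subset-stable shuffle described here, assuming each test's position is derived only from its own name and the seed. This is an illustration, not Catch2's actual code; `seeded_hash` and `shuffled` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

namespace sketch {

// FNV-1a style hash of the test name, mixed with the seed of the run.
// (Hypothetical helper; any stable hash of name plus seed would do.)
inline std::uint64_t seeded_hash(const std::string& name, std::uint64_t seed) {
    std::uint64_t h = 14695981039346656037ULL ^ seed;
    for (unsigned char c : name) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Sorts the tests by (hash, name). The sort key of a test depends only on
// that test and the seed, so removing other tests from the run cannot
// change the relative order of the tests that remain.
inline std::vector<std::string> shuffled(std::vector<std::string> names,
                                         std::uint64_t seed) {
    std::sort(names.begin(), names.end(),
              [seed](const std::string& a, const std::string& b) {
                  const std::uint64_t ha = seeded_hash(a, seed);
                  const std::uint64_t hb = seeded_hash(b, seed);
                  return ha != hb ? ha < hb : a < b;  // name breaks hash ties
              });
    return names;
}

} // namespace sketch
```

Calling `sketch::shuffled` with the same seed on the full list and on a filtered list gives orders that agree on the tests they share, which is the property that makes bisecting a test dependency practical.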
Starting point is 00:18:45 Okay. And it's actually useful, because when you find that there is some dependency between your tests, then you can say, okay, I will cut off half of the tests, and what does it do? Oh, it still reproduces, so I know these tests do not matter. So you can actually cut down which tests you have to run to find the dependency. And then there was the much bigger release, which is a preview of the next major version, and that changes a lot of things, but the biggest change is that it will no longer be a single header library.
Starting point is 00:19:20 Now you will just include the parts that you want, and there is also a static library that you will compile for the implementation itself. See, this is interesting. I'm assuming Phil originally chose single-header library because that's easy to adopt. It's easy to convince people to install a single-header library. What convinced you to make this change
Starting point is 00:19:48 to not being a single header anymore? Basically, it's about the future of Catch. The problem with having a single header is that you either do not add new features, you have to say, okay, these are the features that I support, there will be no more features. Or you have to have configuration macros for everything. So then you have 50 different configuration toggles to turn this off, turn this on, like to include matchers, include generators, include benchmarking, disable matchers, and so on and so on.
Starting point is 00:20:21 Or you have to accept that your library will have one megabyte as the header and it will take forever to compile. And if you actually look at the latest version of Catch in the single header version, it already has like 600 kilobytes as a header. Right. And even the part that everyone includes, the one without the implementations, that has like 4,000 lines of code and includes a lot of stuff and a lot of standard headers. So basically the question is, what do I want in the future for Catch? And my answer was that I want more features, and that does not work with a single header. I want Catch to provide you matchers that you actually can use and matchers that people want
Starting point is 00:21:08 to use, and generators and so on. But I already found myself saying, okay, this is a matcher that is interesting. I can see that some people would use it, but merging it into Catch would mean that there is quite a lot of extra code and extra
Starting point is 00:21:24 includes from the standard library. But if only one percent of people, which is probably optimistic, will use this matcher, everyone pays for it, right? Everyone pays the compilation price, but only one percent, which is really optimistic I think, would use it. So I had to say, okay, I am not merging this because it doesn't work for Catch. Okay. Makes sense. So as we already talked about, you were a user of Catch before you started contributing and now maintaining the project. Do you still use it?
Starting point is 00:21:54 Is that, you know, are you using it as part of hobby projects or part of your day job? I use it for my day job. Actually, I ported our code base, my code base at work, to the new version of Catch. I got something like 5% compile time improvement with the preview that I released recently, and actually like 10% improvement
Starting point is 00:22:18 with the version that's in Git right now. Wow. Which is, you know, nice. Yeah. Not like this changes everything, but it's nice. So I just saw a tweet from you this morning, I believe, unless I'm getting tweets confused,
Starting point is 00:22:31 where you said you had tried to use extern template to improve your build times with no success. Yes. Is that the header version? No, the new one. The split, let's call it split implementation version. And the problem is that it actually makes perfect sense when you figure out what's going
Starting point is 00:22:54 on, but it's unintuitive. Basically, the problem is, if you extern template some class, you don't save yourself much, because when you compile code that uses the class, the compiler still has to look into the class and instantiate most of it, right? It has to know how big the class is, what overloads it has if you call some function on it, well, member function, and so on. So if you say, okay, I'm externing this template class, then if there isn't some big, really big function, you don't get anything. It's unlikely that you will actually get some advantage from this. So if you want to extern things, it's probably better to go with template functions, well,
Starting point is 00:23:35 function templates, because you can call functions without seeing their body, right? You don't need to know what's on stack in this function to call it. So you can say, OK, extern this function template, and then just call it. And that will probably save some time. It's on my to-do list to do some investigations about that. I think so. Maybe if I can digress a bit.
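As a rough illustration of externing a function template, here is a single-file sketch. In a real project the `extern template` declaration would normally live in the header and the explicit instantiation definition in exactly one source file; `sum_all` is a made-up example function, not anything from Catch2.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

namespace sketch {

// A function template whose body we would rather not instantiate in every
// translation unit that calls it.
template <typename T>
T sum_all(const std::vector<T>& values) {
    return std::accumulate(values.begin(), values.end(), T{});
}

// Explicit instantiation declaration (normally in the header): tells the
// compiler that sum_all<int> is instantiated elsewhere, so callers only
// need the signature and can skip instantiating the body.
extern template int sum_all<int>(const std::vector<int>&);

// Explicit instantiation definition (normally in exactly one .cpp file):
// this is where the code for sum_all<int> is actually generated.
template int sum_all<int>(const std::vector<int>&);

} // namespace sketch
```

Whether this saves measurable compile time depends on how expensive the body is to instantiate, which matches the caveat in the discussion: a class template still has to be mostly instantiated at the point of use, while a function body does not.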
Starting point is 00:24:03 So the thing I was trying to extern was unique pointer of the test registration class. Okay. Because I tried the Clang Build Analyzer with the new Clang -ftime... Is it trace or report? -ftime-report. And one of them does. I think it's -ftime-report. Okay.
Starting point is 00:24:24 Yeah. So I tried Clang Build Analyzer, and it told me, okay, so the template instantiation that takes the longest is std::unique_ptr for some Catch internal class, the one that registers your test cases. And then there is another instantiation that takes just as long, and it's the unique pointer implementation code, right? It's in some internal detail, and there are two or three more levels to this. So I thought, okay,
Starting point is 00:24:49 I'm going to extern template this. It did nothing. And if I look at the sum of all the unique pointer instantiation times, it takes 10% of the time I spent compiling my tests. They're just instantiating unique pointers. Oh, wow. So if in the future you hear me going, okay, unique pointers are overrated, Catch no longer uses them, that's why. I just actually saw a project from someone that was showing how you can make
Starting point is 00:25:22 a lighter weight, standards-conformant unique pointer if you use, you know, C++17 features or whatever. I don't know what they did different, but I was looking at it and, like, yeah, that's actually an interesting question, because the standard versions of unique pointer are more complicated than you would think they need to be, given the basic semantics of unique pointer and given what we all know it can compile down to in an optimized build. So it would be interesting to throw in your own light, you know, simple unique pointer and see what difference it makes. I might. Like, I don't have to support all the things unique pointer does, right? I don't care that much for covariance.
Starting point is 00:26:06 I don't care for deleters. I just want to go delete. So I could really get rid of most of it. Right. Yeah, that's a good point. Yeah. Templated deleter. Yeah. I would like to return to
Starting point is 00:26:22 the units library a bit. My opinion is that it's really nice. I would like to use it. But my experience with Chrono is that if you have a lot of code that uses Chrono and you try doing a debug build, it will be like orders of magnitude
Starting point is 00:26:38 slower than if you have a release build, which is a real problem. I have seen tests go from like 1 to 2 seconds to run to minutes because of the difference between, oh yeah, actually optimize all this chrono type code down to something reasonable, and just leave it in as it is. So that's my long-term misgivings about these libraries. It's interesting because there's been lots of articles recently about the,
Starting point is 00:27:06 well, Vittorio wrote an article also recently about the performance overhead that we get from the abstractions that we use in C++ in debug builds, when we just have O0 turned on. And, yeah, it's an interesting point, because in my Doom experiments that I've been doing recently, and I guess I don't think we even... oh wait, we have mentioned that on the show, you know, the 13-hour Doom stream that I did. A debug build of Doom at O0 can still manage over 60 frames per second at 1080p, and when I compile with O3, I'm at closer to 200 frames per second. So what am I talking about, a little over three times performance difference. And it's just using C code. It's not using any of
Starting point is 00:27:52 these abstraction layers that we are used to using in C++. It'd be interesting to see how much slower I make the debug build, potentially, as I roll more C++ features into it. How do you feel about relying on O1 or Og or something like that in debug builds, Martin? I don't think there is support for it in MSVC, so I don't feel anything about it. Ah, yeah, let's see. You would have to... Actually, I don't know whether Clang supports it either.
Starting point is 00:28:23 Clang supports Og, but last I looked, it was a synonym for O1. It's just to maintain command line. It doesn't really. Right. I want to interrupt the discussion for just a moment to bring you a word from our sponsor, PVS-Studio. The company behind the PVS-Studio static code analyzer, which has proven itself in the search for errors, typos, and potential vulnerabilities. The tool supports the analysis of C, C++, C#, and Java code.
Starting point is 00:28:51 The PVS-Studio analyzer is not only about diagnostic rules, but also about integration with such systems as SonarQube, PlatformIO, Azure DevOps, Travis CI, CircleCI, GitLab CI/CD, Jenkins, Visual Studio, and more. However, the issue still remains: what can the analyzer do as compared to compilers? Therefore, the PVS-Studio team occasionally checks compilers and writes notes about errors found in them. Recently, another article of this type was posted about checking the GCC 10 compiler. You can check out the link in the description of the podcast. Also, follow the link to the PVS-Studio download page. When requesting a license, write the hashtag cppcast and receive a trial license not for one week, but for a full month. Another tweet I recently saw from you, Martin, which I was
Starting point is 00:29:38 curious about. Your Twitter is coming back to haunt you now. Well, you commented on our episode with Billy O'Neal a few weeks ago, talking about how, it was from our discussion about suppressing warnings from included headers, I believe. And you talked about some awful, awful code that exists in Catch2 to work around that. And I was curious if you could expand on that. Okay, so the first thing to know is how
Starting point is 00:30:07 Catch works, right? How can Catch decompose the expression? It works by magic. Sadly, no. So you write REQUIRE(a == b). That's what the user sees. And then after a couple rounds of
Starting point is 00:30:23 macro expansion, what you get is a Catch::Decomposer instance, then <=, then the expression that you had originally. And the idea is that we insert some Catch type that has overloaded all the comparison operators on the
Starting point is 00:30:40 left, and because it evaluates left to right, we basically change the type of the expression. This is okay, but there are some limitations, of course. One is that this causes ODR-use of whatever is in the expression, which actually does change things. I regularly get issues about code not compiling because Catch requires ODR-use of what it gets. But the more important thing is that when we do this, it kind of breaks literals, integer literals most of all. If you have some C code or C++ code and you write
Starting point is 00:31:20 if (a == 2), where a is some variable of type unsigned int, then you will not get a warning, because, you know, unsigned int and some integer literal that's positive, that's okay, that's fine. But if you write REQUIRE with the same variable, a ==
Starting point is 00:31:39 2, then what the actual expression will be is that the left-hand side remains unsigned, but the right-hand side literal becomes an int, and then when we actually do the comparison, the compiler would complain because suddenly there are different signs. So we have to disable this warning or it will be too much noise. So how do you get the warning back?
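To make the decomposition trick concrete, here is a heavily simplified sketch of the idea, not Catch2's real implementation. All names (`sketch::Decomposer`, `ExprLhs`, `Result`, `SKETCH_DECOMPOSE`) are hypothetical, and only `==` is overloaded.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace sketch {

// Outcome of a decomposed comparison plus a printable expansion of it.
struct Result {
    bool passed;
    std::string expansion;
};

// Captures the left-hand side and overloads the comparison operator.
template <typename Lhs>
struct ExprLhs {
    const Lhs& lhs;

    template <typename Rhs>
    Result operator==(const Rhs& rhs) const {
        std::ostringstream os;
        os << lhs << " == " << rhs;
        // The comparison now happens inside this template. A literal such
        // as 2 has already been deduced as a plain `int` here, which is why
        // an `unsigned` left-hand side can trigger a sign-compare warning
        // that the handwritten `a == 2` would not.
        return Result{static_cast<bool>(lhs == rhs), os.str()};
    }
};

// `Decomposer{} <= a == b` parses as `(Decomposer{} <= a) == b`, because
// `<=` binds more tightly than `==`. That is how the left operand is grabbed.
struct Decomposer {
    template <typename Lhs>
    ExprLhs<Lhs> operator<=(const Lhs& lhs) const {
        return ExprLhs<Lhs>{lhs};
    }
};

} // namespace sketch

// Stand-in for what an assertion macro might expand to after a few rounds.
#define SKETCH_DECOMPOSE(...) (sketch::Decomposer{} <= __VA_ARGS__)
```

With `int a = 2;`, `SKETCH_DECOMPOSE(a == 2)` yields a `Result` holding both the boolean outcome and the two operand values as text, which is what lets a framework print the expanded expression on failure.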
Starting point is 00:32:05 Well, what we do right now is that we have some really terrible expression, which is something like: all of the REQUIRE macro is enclosed in a do-while loop, right? And the while
Starting point is 00:32:22 condition is 0, 0, and... no, sorry, 0, comma, false, and-and static_cast<bool> of double negation of the expression. And what this does
Starting point is 00:32:37 is that actually this is unevaluated, right? It will not be evaluated because it's behind a short-circuiting logical AND, which starts with a false. So obviously it will never actually happen. But MSVC will look into the expression and issue warnings. Okay. So that's MSVC. And old Clangs, like Clang 3.8 and earlier, actually also did this. Modern
Starting point is 00:33:03 Clangs do not, and GCC doesn't either. So what happens for GCC and Clang, and there is this big footnote that says not all versions of Clang, because a lot of compilers report themselves as Clang, is that we actually use a built-in const... no, what is it called? Some built-in that basically says, ask the compiler: is this a constant expression, yes or no? But we don't care, so we just cast the return to void. But the compiler will actually take a look
Starting point is 00:33:34 at the expression and complain if it sees some warnings in the expression. Okay. And of course then someone opens up an issue saying, and we have this IBM XL compiler which reports itself as Clang, but if you give it an expression inside this built-in, it will generate these destructors.
Starting point is 00:33:53 It will generate calls to destructors. Nothing else. It won't construct the object, it will just destruct them. What do you think happens when you type in a destructor and this happens? So there's actually like another... Okay, and if it's like IBM XL, then don't do this. Just don't avoid anything. No warnings for you.
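A rough sketch of the two "warn without evaluating" approaches described above. All macro and variable names here are hypothetical. The built-in is never named in the conversation, so `__builtin_constant_p` (a real GCC/Clang built-in that matches the description) is an assumption of this sketch.

```cpp
// Hypothetical names; a sketch of the idea, not Catch2's actual macros.
static bool sketch_last_result = false;

// Approach 1: put the expression behind `false &&` so it is never
// evaluated, while a compiler that analyses it can still issue warnings.
#define SKETCH_WARN_SHORT_CIRCUIT(...) \
    ((void)0, false && static_cast<bool>(!!(__VA_ARGS__)))

// Approach 2 (GCC/Clang): ask "is this a constant expression?" and throw
// the answer away. Assumption: __builtin_constant_p is the built-in meant.
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_WARN_BUILTIN(...) (void)__builtin_constant_p(__VA_ARGS__)
#else
#  define SKETCH_WARN_BUILTIN(...) (void)0
#endif

// Stand-in for an assertion macro: the expression is evaluated once in the
// body, and appears again only in contexts that never run it.
#define SKETCH_REQUIRE(...)                                      \
    do {                                                         \
        SKETCH_WARN_BUILTIN(__VA_ARGS__);                        \
        sketch_last_result = static_cast<bool>(__VA_ARGS__);     \
    } while (SKETCH_WARN_SHORT_CIRCUIT(__VA_ARGS__))
```

The `while` condition is always false, so the loop body runs exactly once, and an expression with side effects such as `SKETCH_REQUIRE(next_id() == 1)` is only executed by the assignment in the body.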
Starting point is 00:34:15 And this is not all, because if you do this, Clang-Tidy complains. Because Clang-Tidy goes, oh, this is some built-in that I think is a variadic function, right? So it will tell you: but you are using variadic functions. Don't use variadic functions. Variadic functions are bad. So there is also a NOLINT for the Clang-Tidy linter saying, no, I know what I'm doing. Don't warn about variadic expressions.
Starting point is 00:34:44 Well, calls to variadic functions here. So that's fun. There is also some extra fun because Clang-Tidy has aliases, so some warnings are aliases of different warnings, right?
Starting point is 00:35:00 So there are actually two warnings that do the same thing, but have different names for the variadic functions. And if you silence one of them, it will not silence the other. So I actually am suppressing both of the warnings there. And I opened an issue with Clang-Tidy because this is not how it should behave. No, yeah, I saw you tweet about that too. I totally agree. Yeah, suppressing one should absolutely suppress
Starting point is 00:35:25 all the aliases as well. So how much of that kind of thing that you just described will you be able to remove when you're able to make the split version of Catch? No. No? Okay. Oh, really? Yeah, there isn't anything I can do with this, right? All of this is just to handle taking an expression and actually being able to take the expression apart. There is nothing here that I can help by having more or fewer headers or less code. All of this logic just has to remain the same.
Starting point is 00:36:04 That's unfortunate. It's not even the most terrible part of our macros. We have worse. It's an interesting point. I was having a conversation with a friend recently about, yeah, sometimes, like I aim for simple C++ code. That's my goal. Sometimes you need the complexity to make the
Starting point is 00:36:27 API simple, and then you try to hide that complexity. So thank you for hiding the complexity of Catch so that I can use a nice, clean, simple API in my code. Yeah. So, you know, we mentioned in your bio that you're currently working on converting master key systems to SAT. Could we maybe go into that a little bit, and tell us what SATs are? Okay, I'll start with the master key systems first. Sure. So it's very simple: when you have an office building, then usually you want different keys to have different access rights. Right?
Starting point is 00:37:11 So there is a key that opens all the doors, a key that opens everything on a specific floor. I don't know, a key that opens the main entrance and then it opens the electrical cabinet, but doesn't let people onto the floors and so on. So that's called the master key system. You have some relation between the keys and locks and they can open each other or they can be blocked. The key can be blocked in the lock so it can't open it. And what we do is that we take these systems together with a bunch of other rules,
Starting point is 00:37:43 like how many positions are there on a key, how many cutting depths are there on the key, any constraints on the manufacturing. You can't cut too deep and not as deep on the next position because you
Starting point is 00:38:00 are making weak cuts. If you cut deep enough, you already cut off the material from the positions next to it. So we're talking about physical keys. I just want to make sure that's clear. Like brass keys. Okay. And it has a bunch of rules,
Starting point is 00:38:17 like the rules that, you know, we can't actually make the keys if these rules aren't there, and there are rules for security reasons and so on. And so we make software that takes all these rules and then gives back the cutting codes for the key, which is like, okay, so you cut this position this deep, this position this deep, and so on. And the implementation detail of it is that we actually take the system, we convert it to a large logical formula, and then we just throw it into a SAT solver. Well, we used to do that a year and a half ago.
Starting point is 00:38:59 Now it's a lot smarter, but we still use the basic idea of describing all the rules and the keys and the locks and the relations in the formula and then just telling some SAT solver to go and solve it for us. Okay. And this is kind of similar to scheduling all the students in a classroom or whatever kind of problem? It sounds like to me. Yes, kind of.
Starting point is 00:39:27 Okay. It's like the same computational complexity. Okay. You use different techniques for both. So that's what I do. Huh. So a SAT solver is a constraint solver? No.
Starting point is 00:39:46 Well, yes, but no. Okay. Because the reason why I'm making the distinction is that there are actually constraint solvers which use a richer language to describe the problems.
Starting point is 00:40:01 If you are working with a constraint solver, you can say, okay, and these five variables all have to have different values. If you are working on the level of a SAT solver, you have to say, well, you just give it a large logical formula in CNF form. So it's like a conjunction
Starting point is 00:40:17 of many, many, many, many, many, many disjunctions. Okay. So you end up with like 10 million variables, a couple hundred million clauses, and just tell the software, okay, go solve it. So you're doing this in C++? Yes.
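To make the "conjunction of many disjunctions" concrete, here is a small sketch of how one simple rule could be lowered to CNF. The rule ("every key position is cut to exactly one depth") and the pairwise encoding are a common textbook choice and an assumption of this example, not the encoding described on the show. The names `Clause`, `var` and `exactly_one_depth_per_position` are hypothetical.

```cpp
#include <vector>

// A clause is a disjunction of literals. A positive int is a Boolean
// variable, a negative int is its negation.
using Clause = std::vector<int>;

// Hypothetical numbering: variable "position p is cut to depth d",
// starting at 1 so that negation by sign works.
inline int var(int p, int d, int depths) { return p * depths + d + 1; }

// Returns the clauses whose conjunction says: each position has exactly
// one cutting depth.
std::vector<Clause> exactly_one_depth_per_position(int positions, int depths) {
    std::vector<Clause> cnf;
    for (int p = 0; p < positions; ++p) {
        // At least one depth: (x_p0 OR x_p1 OR ...).
        Clause at_least_one;
        for (int d = 0; d < depths; ++d) {
            at_least_one.push_back(var(p, d, depths));
        }
        cnf.push_back(at_least_one);

        // At most one depth: for every pair, (NOT x_pa OR NOT x_pb).
        for (int a = 0; a < depths; ++a) {
            for (int b = a + 1; b < depths; ++b) {
                cnf.push_back({-var(p, a, depths), -var(p, b, depths)});
            }
        }
    }
    return cnf;
}
```

Even this one rule needs 1 + D(D-1)/2 clauses per position for D depths, which gives a feel for how real systems with many rules reach millions of variables and clauses.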
Starting point is 00:40:39 Are you passing... Sorry, brain, I have to take a step back here. One of the projects that I'm involved in describes the problem in a DSL, which then generates C++ code, which they then compile and run so that they can get the best throughput possible. That's the theory. It's a project that's currently in the works. And to just complicate things, the actual solver that generates C++ code is an open source project that's written in Java. So it just makes the whole thing. But the question is, are you doing something like that?
Starting point is 00:41:20 Or is it like a library? Well, we actually used to write out... okay, maybe step back. There is a format that's used by SAT solvers that pretty much every SAT solver understands. It's called DIMACS, and the first prototype many years ago actually just wrote out DIMACS to standard out, and a different binary just read the DIMACS, solved it, wrote back the solution, and so on. And nowadays we use it as a C++ library.
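For reference, the DIMACS CNF text format mentioned above is simple enough to write by hand: a `p cnf <variables> <clauses>` header, then one clause per line as space-separated integers terminated by `0`. A minimal writer might look like this; the function name `to_dimacs` is hypothetical.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Serialises a CNF formula to DIMACS text. Literals are non-zero ints:
// positive for a variable, negative for its negation.
std::string to_dimacs(int num_vars,
                      const std::vector<std::vector<int>>& clauses) {
    std::ostringstream out;
    out << "p cnf " << num_vars << ' ' << clauses.size() << '\n';
    for (const auto& clause : clauses) {
        for (int literal : clause) {
            out << literal << ' ';
        }
        out << "0\n";  // every clause ends with a 0 terminator
    }
    return out.str();
}
```

For the formula (x1 OR NOT x2) AND (x2), `to_dimacs(2, {{1, -2}, {2}})` produces a three-line file: `p cnf 2 2`, `1 -2 0`, `2 0`.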
Starting point is 00:41:51 Are there lots of different libraries available for writing SAT solvers? Or are you writing a lot of this code by hand? There's a bunch of them. There is a bunch of SAT solvers and you just use
Starting point is 00:42:08 their API bindings directly, usually. I think I mentioned two or three SAT solvers in my CppCon talk. Here are some options. You'll use one of them. I'll have to make sure to put links to those in our show notes.
Starting point is 00:42:25 Jason, you want to ask about the random blog post? Sure. You recently posted a blog post on the issues with standard random, so I'm curious what issues you found, if you could describe that for our listeners. Yes.
Starting point is 00:42:43 Okay, so disclaimer first, I didn't find them. I was just the latest person crazy enough to try and standardize some fixes. Okay, the latest person. How I like to describe the problem is that std random doesn't solve anyone's problem, right? There are people who are beginners. They just want some random integer and do something with it. They don't care about portability, quality of the integer. They just want an integer. And these people are not served by random because, like, have you seen the interface?
Starting point is 00:43:23 It does take a few lines of boilerplate to get a number. Yeah, a few lines. I just want an integer, right? I don't want to go and figure out whether I should, like, have std::mt19937, or maybe I want, I think, ranlux24, 48? I don't know. Well, I mean, I do, but if you are a beginner,
Starting point is 00:43:47 this is a lot of complexity you don't want to deal with. And then you have to also seed the random number engine and so on. So, obviously, it's aimed at experts, if you look at the interface, but the problem is it doesn't support experts, right? You can't, like, if you need a portable, like, if you want to generate the same numbers using the same seed on different platforms, you can't use std random.
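For readers who have not seen it, the "few lines of boilerplate" look roughly like this. The wrapper name `random_int` is hypothetical, and seeding a `std::mt19937` from a single `std::random_device` value is the common shortcut, not a recommendation.

```cpp
#include <random>

// Returns a uniformly distributed integer in the closed range [low, high].
int random_int(int low, int high) {
    // One engine per program, seeded once from the platform's random source.
    static std::mt19937 engine{std::random_device{}()};
    // The distribution maps raw engine output onto the requested range.
    std::uniform_int_distribution<int> dist(low, high);
    return dist(engine);
}
```

A die roll is then `random_int(1, 6)`, but a beginner has to know about three separate components (device, engine, distribution) to get there.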
Starting point is 00:44:14 And this is not an obscure problem, right? If you've ever played something like Minecraft or Terraria or anything that does procedural generation from seed, you probably want this game to generate the same world from the same seed, whether it's running on Windows, Linux, or Mac, or, you know, whatever. So why does it... I'm going to interrupt you. If I'm using the Mersenne Twister that you alluded to, whatever, mt19937, whatever,
Starting point is 00:44:40 and I seed it the same and I use the same distribution, why would I not get the same results on each platform? Because every platform implements the distribution differently, so you get different results, right? Oh. The standard intentionally, and I think this was the wrong decision, but intentionally doesn't specify how the distributions are implemented. So every standard library implementation uses different implementations for their
Starting point is 00:45:08 distributions, which in turn means that you get different results. And of course you don't want to use the Mersenne Twister directly, because then you have the bias issues that we all know about, and they are the reason why we have something better than std::rand. Okay, so this is like the first problem. And, okay, but, you know, the design of std random is that you can write your own distribution and just plug it on top of the existing random number engines, right?
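As a sketch of that "write your own distribution" idea: the output sequence of `std::mt19937` is pinned down by the standard, so a hand-written mapping on top of it gives the same numbers on every platform. The helper below uses plain rejection sampling to avoid modulo bias. Its name `portable_below` is hypothetical, and this is one possible approach, not what any particular library does.

```cpp
#include <cstdint>
#include <random>

// Returns a uniformly distributed value in [0, bound); bound must be > 0.
// Unlike std::uniform_int_distribution, the algorithm is fixed here, so the
// result depends only on the engine state.
std::uint32_t portable_below(std::mt19937& engine, std::uint32_t bound) {
    // mt19937 produces values in [0, 2^32 - 1].
    const std::uint64_t range = std::uint64_t{1} << 32;
    // Largest multiple of bound that fits into the engine's range.
    const std::uint64_t limit = range - (range % bound);
    std::uint64_t draw = 0;
    do {
        draw = engine();       // redraw values from the biased tail
    } while (draw >= limit);
    return static_cast<std::uint32_t>(draw % bound);
}
```

Two engines constructed with the same seed, for example `std::mt19937 a(42), b(42);`, then produce identical sequences through `portable_below` regardless of the standard library in use.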
Starting point is 00:45:38 Cool. But then you find out that every other layer in std random is also flawed, so you can't use it either. It's like, okay, so we have std::random_device. The idea is that this is some way to get some randomness, like this is an abstraction over platform-specific randomness, right? Sounds good. So what does the standard say? Well, the standard says it should provide random bytes, or it might not. Just implement it with some random number engine, which is deterministic, whatever.
Starting point is 00:46:10 You can do that. And then it provides this member function called entropy. And in theory, you could call entropy and just go, OK, it tells me it doesn't have any entropy. It can't be random, right? If there is no entropy, it must mean that the output is predetermined. Well, there is a problem. If you use libc++, it will never not return zero. It's just hard-coded: okay, return zero. But okay, so where does it get its random bytes? Well, it can either read /dev/random, /dev/urandom, NaCl secure random,
Starting point is 00:46:46 rand, and basically it uses a bunch of backends and all of them are secure, right? They all have different entropy, yeah. They all should have like full entropy. They are like platform-specific, strong random functions. A bunch of them are even cryptographically secure and do provide that guarantee. So this approach doesn't work. And it's still better than what libstdc++ does, which actually has even more possible backends. It has six different backends from which it can actually read the random data, and it either returns 0 as a constant
Starting point is 00:47:26 for most of the backends, or it will actually ask the kernel how much entropy does the kernel think it provides. This is the configuration which reads from /dev/random or /dev/urandom. There is of course this obvious problem, right? This is like
Starting point is 00:47:42 asking, does this file exist, before opening it? You don't do that. You just open the file, and if it fails, you handle the error. Because if you ask first and then try to open the file, you have a race. You can find out that this is nice, and the file existed,
Starting point is 00:47:58 and it doesn't exist anymore. So you shouldn't ask, just open. And this is the same problem, right? Is there any entropy? There is, cool, give it to me. Is it still there? I don't know. And also there is like this massive digression because I don't actually believe in depletable entropy the way the Linux kernel does. And it's my... like I didn't make any like real studies, but most people that I talked with in a security context say that you do not
Starting point is 00:48:26 lose the entropy. Once you initialize the kernel, it initializes its cryptographically secure random number generator that derives more random data from what it has. You can't deplete the entropy pool anymore.
Starting point is 00:48:42 Apart from the fact that entropy can't work, because it's not an API that can work because of races, it's also useless. And MSVC is like the one platform that actually implements it properly.
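The `entropy()` query discussed above is a one-liner to call, which is part of why its unreliability is easy to miss. A small probe, with the hypothetical names `DeviceReport` and `probe_random_device`:

```cpp
#include <random>

// What a program can learn from std::random_device at run time.
struct DeviceReport {
    double entropy;   // implementation's estimate, in bits per call
    unsigned sample;  // one value drawn from the device
};

DeviceReport probe_random_device() {
    std::random_device device;
    // entropy() may report 0 even when the bytes come from a strong
    // platform source, so it cannot be used to detect a deterministic
    // fallback.
    return DeviceReport{device.entropy(), device()};
}
```

Printing `probe_random_device().entropy` under different standard libraries is a quick way to see the behaviours described in the conversation for yourself.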
Starting point is 00:48:57 It just always returns 32, like the full possible entropy for 4 bytes, and they just call into the kernel-provided cryptographically secure random number generator. So one option that's guaranteed to be secure and with... okay, so, yeah, but you will
Starting point is 00:49:14 basically have to either start hard-coding some list of platforms which work for you, like you can say, okay, so if I'm on Windows and my libstdc++ version is less than seven, then it uses Mersenne Twister, or it might be more, in which case it should use rand_s, or you have to write your own random device, right?
Starting point is 00:49:33 Because at some point, this is less work than just going through the different implementations and saying, yes, this version works. No, this version doesn't. And then you have to- All right, so what's your solution? My solution for this is like deprecate entropy. I would just remove it, but
Starting point is 00:49:50 you can't remove things from the standard, so deprecate it. And my preferred solution then would be to just say, okay, if you implement std::random_device, it has to provide cryptographically secure random bytes. This will never pass, so the solution that can actually get into
Starting point is 00:50:05 the standard is to provide two queries. One of them is, is this really random? Or is this just some wrapper over some pseudo-random number generator like std Mersenne Twister, like libstdc++ does? Well, it did on Windows. And the second query is, is this actually a cryptographically secure random data generator? Because this is what people care about, right? And entropy doesn't do anything for them. Right. And then there is also a lot of issues with seeding the generators,
Starting point is 00:50:40 but let's not go over that. Okay. I think we might be running out of time to go over that. Yeah. Well, before we go, were you able to present this paper at the last ISO meeting? Did you get any good feedback? Yeah, I presented all three papers. The feedback was, for the random device, it was basically that I should write the version that I just said and present it again.
Starting point is 00:51:03 Okay. Okay. Uh, the, okay. Uh, I'm not sure which of the other two papers was which, but basically one, uh, one paper didn't have consensus to have consensus.
Starting point is 00:51:14 Right. There was like, uh, basically the result of voting was this. Right. Okay. Yeah. Okay.
Starting point is 00:51:25 And the third one, I think it was like weakly positive feedback, but nothing really interesting. So I will have to rewrite them and try again. I will have to be more persuasive. So this might sound silly,
Starting point is 00:51:43 but I will say the last few times that I've personally gone to use std random, I have not been able to, not for any of the reasons that you said, but because I needed constexpr. I mean, if I give it a static seed and I want to generate the same set of generatable data. Yeah, there is no reason for this to not work, but someone has to write the paper. Right, and so I just have a constexpr random implementation. I just pull around.
Starting point is 00:52:13 Yeah, sure. But yeah, there is no reason not to do it. You don't allocate memory, you just do some computations. No, I'm pretty sure that the current std random, except for the random device, could just have constexpr slapped on it, basically.
Starting point is 00:52:30 Yes. Well, Martin, it's been great having you on the show today. Anything else you wanted to go over or put in the show notes? We'll definitely find the papers that you're working on and some of your CppCon talks. No, I think that's all of it. Unless you
Starting point is 00:52:46 want to hear me rant more about std random. I have many more words about std random. Maybe in a future episode after you get more feedback from your papers on it. Yeah, I think that's a good idea. Okay, thanks Martin.
Starting point is 00:53:02 Thanks for coming on. Thank you. Thanks so much for listening in as we chat about C++. We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, we'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter. You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank all our patrons who help support the show through Patreon.
Starting point is 00:53:38 If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode was provided by podcastthemes.com.
