CppCast - Provable Functions at CppCon 2019

Episode Date: September 19, 2019

Rob and Jason are joined by Lisa Lippincott in this special episode recorded at CppCon. They first discuss some of the conference highlights and favorite talks so far. Then Lisa gives an overview of h...er 'Truth of a Procedure' talk. Later they talk about Lisa's work on the ISO committee, her thoughts on Contracts and much more. Links C++Now 2019: Lisa Lippincott "The Truth of a Procedure" C++Now 2018: Lisa Lippincott "The Shape of a Program" Layout-compatibility and Pointer-interconvertibility Traits Signed Integers are Two's Complement Sponsors Backtrace Announcing Visual Studio Extension - Integrated Crash Reporting in 5 Minutes JetBrains

Transcript
Discussion (0)
Starting point is 00:00:00 Episode 215 of CppCast with guest Lisa Lippincott, recorded September 19th, 2019. This episode of CppCast is sponsored by Backtrace, the only cross-platform crash reporting solution that automates the manual effort out of debugging. Get the context you need to resolve crashes in one interface for Linux, Windows, mobile, and gaming platforms. Check out their new Visual Studio extension for C++ and claim your free trial at backtrace.io.cppcast. And by JetBrains, makers of smart IDEs to simplify your challenging tasks and automate the routine ones. Exclusively for CppCast, JetBrains is offering a 25% discount for a yearly individual license, new or update, on the C++ tool of your choice, CLion or Sharper C++ or AppCode. Use the coupon code JETBRAINS for CppCast during checkout at www.Con to discuss the conference.
Starting point is 00:01:14 Then we talk to Lisa Lippincott. Lisa tells us about her CPPCon talk and her work in the ISO committee, where she recently became the Numeric Study Group Chair. Welcome to episode 215 of CBPCast, the first podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how's it going today? I'm all right, Rob. Long CBPCon week. Yeah, it's day four of CBPCon. I think the conference is going... Yeah, it's day four.
Starting point is 00:02:04 No video, so people can't uh see me counting with my fingers here yeah uh yeah so it's been a long conference but an enjoyable one um any highlights for you you wanted to mention oh my goodness i have at the moment i think 49 tabs open of things that i need to go back and look at that doesn't include notes or yeah i don't even know i have a lot of stuff i need to go back and research yeah well it's been great for me meeting lots of listeners lots of our previous guests um you know normally at this point in the show we would call out a piece of feedback, a tweet or something. I did want to mention one listener came up to me and said every single time he listens to the podcast, he hopes that his tweet is going to be read on the show, even when he's listening to ones from before he started listening. So this is, I guess, me calling out his piece of feedback.
Starting point is 00:03:01 I did have one listener tell me that uh i sound really weird speaking at half speed so i told them how to try really hard to speak faster my nice talk that i'm going to give perfect okay well uh joining us today is lisa lippincott uh lisa designed the software architectures of tanium and big fix two systems for managing large fleets of computers she's also a language nerd and has contributed to arcane parts of the c++ standard in her spare time she studies mathematical logic and wants to make computer checked proofs of correctness a routine part of programming lisa welcome to the show thank you okay so um you you like to study math logic for fun i do it is my first love.
Starting point is 00:03:50 I went to graduate school in math, but ended up doing programming because it's easier to get a programming job. It's actually interesting to me that you say that because I applied to graduate schools and my wife came with me and she already at the time had an undergrad in pure math and a master's in applied math. And those university departments are trying to recruit her into the computer science department instead of me. And I was the only one who was actually interested. Okay.
Starting point is 00:04:12 Well, Lisa, we want to talk more about your history with C++. But maybe we could start off just asking you how much you're enjoying the conference this week. Oh, I'm having a good time. I mean, getting up in the morning after the late nights is probably the, you know, that's the tough part. It starts early and can end quite late. But apart from lack of sleep, I'm having a great time. I'm learning things. I'm talking to lots of interesting people. There's good talks. With the new venue, is there anything particular that stands out to you as being different, better, or whatever compared to Bellevue?
Starting point is 00:04:56 Okay, I can't say I spotted it myself. But it's been a few years since I've been to CppCon, so I'd kind of forgotten about the Bellevue venue. But being able to move around in the halls without squeezing past people is really nice. There's definitely a lot more wiggle room in these places. Yeah, that's right. During the breaks in Bellevue, we would get awfully close and personal. Yeah, that's right. During the breaks in Bellevue, we would get awfully close and personal. Yeah.
Starting point is 00:05:27 On the flip side of that is there's people I know who are at the conference and I haven't seen them yet. Wow. Yeah. Any particular highlights for you with talks or some of the keynotes? I quite liked there was a talk on allocator-aware data structures that it was Alistair Meredith and, oh man, my memory is fading. But it was a very nice talk and an interesting scheme that I hope will work. So plans are afoot to do something better with allocators.
Starting point is 00:06:12 And Sean Parent gave a very nice plenary talk today. Yes, he did. Yeah, it was very focused on good quality code, kind of. I like good quality code. Yes, I like good quality, yeah. It does seem that there are some people
Starting point is 00:06:28 who aren't as big of fans of good quality code as they could be. I liked how Sean apparently seemed to be actively trying to kind of reinvent himself and coin some new catchphrases, maybe, trying to get away from, it's a rotate. He was,ases maybe trying to get away from it's a rotate he was yes uh trying to get away from the also the uh no raw loops he says what did he say i have no i'd like uncountably many times or something that he had been quoted on that and i think i've actually
Starting point is 00:06:58 quoted him on it twice myself and my talks already this week. Well, and particularly after seeing Connor's talk, that in, you know, Connor delivered a rather meta talk that was significantly referencing and showing clips of other people's talks. And he does a much better job of building on the embodied knowledge that is now i guess on youtube um wish i had seen that one actually i wasn't able to make it to it yeah i haven't made
Starting point is 00:07:33 to connor's talk either but i did get a chance to meet connor so it was uh i'll have to follow up and actually watch that talk definitely recommend that one And there are many clips of Sean Parent in it. Okay. Well, just before we sat down to talk, Lisa, you actually gave one of your own talks, The Truth of a Procedure. Do you want to tell us a little bit about how that talk went? Oh, so I think it was well received. Few people were confused. I try not to, but I do sometimes end up confusing a few people. But I think most of the people got it. It being the idea that you can look at a procedure in a computer program,
Starting point is 00:08:26 what we call in C++ a function, but not what we call in math a function, and look at it as a sentence that can be either true or false. It can describe the ways events can be arranged in the world, or it could describe ways in which events cannot be arranged in the world, or it could describe ways in which events cannot be arranged in the world. And how it is better if your procedure describes arrangements of events that can actually happen. That does sound good, yes. Yes. Undefined behavior being a thing that happens when the computer cannot arrange events
Starting point is 00:09:08 the way your procedure suggests they should be arranged okay one of the things i did make it to your talk and one of the things you talked about early on was uh that you showed a lot of code or pseudo code and you mentioned how uh he was using fantasy c++ extensions is what you referred to it yes um do you think there is a place for some of these extensions to actually be introduced in the language or is it more just for the purposes of your talks uh if i were ruler of c++ i would introduce these things into c++ you're on the committee i i i am on the committee but there are a lot of other people on the committee and they also have their own ideas um and some of their ideas may be better than mine i don't want to say my ideas are necessarily the the best ideas. But in some sense, my purpose on the committee is to push C++ in this direction.
Starting point is 00:10:12 So I missed the talk. What are we talking like annotations or new keywords or? So in my fantasy C++, I am fairly broad with, you know, I am willing to create new keywords because fantasy C++ doesn't need backward compatibility. But there are a couple of new, primarily one new set, um, um, statement type. There's, um, a new thing that is a new, um, function return type that isn't exactly a type, but I think that you put in the place where a function return type goes. And, but probably the biggest is a notation for function interfaces that acts as an intermediary, an intermediate block of code that can sit between the caller and the implementation of the function, and that judges the fitness of the parameters and results that go through it. So expressing what many on the committee would prefer to call the contract, I prefer function
Starting point is 00:11:35 interface, as a block of code that sits sort of like, a little bit like, an intermediate function between the caller and the implementation. Other languages have something like this, right, where you can execute code arbitrarily before a function and arbitrarily after a function to set up preconditions? There are languages that have this sort of thing. I don't know of any that have systematically used this for preconditions and postconditions, but there are many languages I don't know. So how does this relate then to the contracts that we didn't get? Okay. If I had my way, and I caution people to not assume I will have my way on this.
Starting point is 00:12:29 Anything that leads to more correct code, you can do your way. But if I had my way, instead of putting attributes on function declarations for pre and post, which I think was the last spelling just before things got yanked out. And instead of doing that with attributes, I would prefer to have an interface block that you can think of as executing in between. That is, the function interface looks rather like a function, but has an implementation statement that marks the point where the implementation is executed. And so the implementation statement divides the function interface into a prologue and an epilogue. And any assertions you make in the prologue are preconditions,
Starting point is 00:13:26 and any assertions you make in the epilogue are postconditions. Okay. So the difference between pre and postcondition is just where in the code it appears. Well, I guess as Jason mentioned, we took contracts out of C++20, and now there's a new study group to reevaluate contracts and I guess rewrite those proposals. I will be going to now there's a new study group to re-evaluate contracts and, I guess, rewrite. I will be going to the meetings of the new study group. That's what I was going to ask. I mean, are you going to try to put your mark on that study group and try to influence the direction at all?
Starting point is 00:13:55 Yes, but there are a lot of strong personalities in that study group. I'm sure that's true, Given the history of contracts that were, as we're aware of from the outside view. Yeah. So does this, this all directly ties in with your desire to make computer checked proofs of correctness for our programming? Is that an accurate statement?
Starting point is 00:14:20 Yes, that is absolutely accurate. So, and this does create, you know, some of the contention that we have that, in a certain sense, I'm aiming for a different goal than other people are with contracts. And, you know, to some degree, it's a little unfair of me to try and take over this language mechanism that people have been working on for a long time and try to push it for my use, which is, I want to get us to a point where we can routinely do computer-checked proofs that our functions will, in fact, obey our contracts. Are all functions in the C++ sense, I guess, not in the math sense, are they all provable, checkable? Like, what are the limitations?
Starting point is 00:15:16 So that's actually the content of the talk I just gave, that there is a theorem, and this is really just Gödel's completeness theorem, that if a function is going to behave correctly in every possible world, given the conditions that it lays out explicitly, its preconditions and the post-conditions of things that it calls, if given all of that stuff is satisfied, your function will work in every possible world, then in fact, there will be a way to turn the function into a proof by adding in the aspects of the world that, you know, the rules about the world that your function relies on. That is, you know, to sum it up, it is if something is necessarily true, then it is provable. And then this comes back around to you saying the premise of the talk was that all functions should be able to become a sentence that is either true or false. Well, all functions are inherently can be read as sentences that are either true or false.
Starting point is 00:16:47 Putting function interfaces on our functions lets us write much more interesting sentences. together across calls because the function interface is where the function overlaps with its caller or alternatively with the functions that it calls. Last year, I did a talk not at this conference, but at others called the shape of a program where I talked about this from a topological point of view. I believe I saw that talk at C++ Now. Yes, I gave that at C++ Now last year. Okay. And that's more about how the functions come together matching on their overlapping areas to make a larger program.
Starting point is 00:17:40 Okay. In the way mathematicians refer to as manifolds. Sounds like another M word. We joke about monad on the show here, and I jokingly say I refuse to learn what a monad is, but someday I'll roll this out. In that case, I won't try to explain what a monad is. I had to, in graduate school, I studied category theory, but I had almost exclusively
Starting point is 00:18:10 studied co-monads instead of monads. And so I had to learn how to turn my thinking around to look at them the way computer scientists started looking at them later. I've never even heard of co-monads. I'm afraid to learn more about them, though. Is it like a co-routine, though? Like they relate to each other in some way? It is the same co. No, I'm just kidding.
Starting point is 00:18:41 It is a kind of turning things around backwards, which is exactly how routines relate to co-routines. But a monad and a co-monad can be seen as the two ends of an adjunction. And an adjunction is just a metaphor. When I say that love is like a snowmobile racing across the icy tundra, what I'm saying is that reasoning about snowmobiles applies to love. And if that's a good metaphor, if that metaphor works in a nice uniform way, then there's going to be a monad about snowmobiles. I get this backwards sometimes. There's going to be a monad about snowmobiles and a comonad about love. And those will define the endpoints of the metaphor. And likewise, if you want to say that lists are like this data structure that I've written,
Starting point is 00:19:40 then there will be a comonad about lists and a monad about the data structure you've written that explain the ends of the metaphor. Sure. Yeah, that went kind of fast. It's okay. I wanted to interrupt the discussion for just a moment to talk about the sponsor of this episode of CppCast, the PVS Studio team. The team promotes the practice of writing high quality code, as well as the methodology of static code analysis. In their blog, you'll find many articles on programming,
Starting point is 00:20:09 code security, checks of open-source projects, and much more. For example, they've recently posted an article which demonstrates not in theory, but in practice, that many pull requests on GitHub related to bug fixing could have been avoided if code authors regularly use static code analysis. Speaking of which, another recent article shows how to set up regular runs of the PVS Studio static code analyzer on Travis CI. Links to these publications are in the show notes for this episode. Try PVS Studio. The tool will help you find bugs and potential vulnerabilities in the code of programs written in C, C++, C Sharp, and Java. So on Monday, you were also on a panel of committee members.
Starting point is 00:20:54 It used to be called Grill the Committee, but then we decided we're no longer going to grill committee members and we'd have a fireside chat with them instead. So can we talk a little bit more about your committee work? You are the chair of SG6, is that right? Just recently, I am the chair of SG6, is that right? Just recently I am the the chair of SG6, the numerics group. Okay. The previous chair has, was needed for other work, so leaving an opening and I just recently accepted that and I haven't actually chaired a meeting yet. Oh, okay. So what does the numeric study group do? Well, okay, the numeric study group is the first line of vetting for proposals for the language
Starting point is 00:21:36 that impact how we deal with numbers. Oh, okay. So it's a very minor role. meets for a day, a day and a half at the meetings, and there's rarely a lot of contention over, you know, there are, you know, some things in the pipe that there's going to be design work on and deciding which way to go. And, you know, probably the biggest thing that went through the numerics group in recent memory was we decided after some investigation of all of the computers that are running modern C++ compilers that we can make a determination that all C++ implementations will use,
Starting point is 00:22:42 choose complement integers. And that went to the numerics group because numbers yes or at least i don't know if it started with him but he was one of the principal forces um and that got accepted for c++20? That changes in 20. How does that actually change the way we interact with the language? Does that now mean that signed integer overflow is well-defined? No, it doesn't mean that signed integer overflow is well-defined, but it does mean that the range of a assigned integer is necessarily from minus two to the n to positive two to the n minus one, if you get the parentheses from the right place. Where n is one less than the number of
Starting point is 00:23:37 bits of the not the number of bits of storage necessarily, but that's... But anyway, it gives you one more guarantee that you get about what integers are going to be like. And in practice, an awful lot of people are programming on computers where they already knew, yeah, for my system, it's going to be two's complement. In fact, we were unable to find a computer that had a modern C++ compiler that did anything else. So what we are doing now is in the language guaranteeing that that thing that was true anyway will in fact be true. So do you expect that this will affect code generation at all in any way, or is it just codifying something that... No, it won't affect code generation because each individual compiler knew the architecture it was compiling for and knew whether its integers were two's complement.
Starting point is 00:24:47 What it does is if you are trying to carefully write portable code, you don't have to check the weird edge cases that come with different, you know, with say a ones complement integer. And before this, it was legal, nobody did it, but it was legal to have a signed integer type where it only went down to minus one, even though it went up to 2000 or something. And these very asymmetric signed integer types are just weird and nobody wants them anyway. I don't believe I've ever used a computer at all that wasn't two's complement even going back to 6502. No, I don't think you're old enough. I don't think I have either. I'm not absolutely certain that I haven't. If I think back to the very earliest computers that I've used, there's a possibility that one of them might have been a one's compliment machine. Do you mind if I ask what your earliest computers were?
Starting point is 00:25:59 We didn't ask that at all. Okay, so the first computer I ever programmed was a Wang 2200 by the name of Herbie. Computers were rare enough back then. We named them. I mean, now I think my desktop computer is just called Lisa's Desktop or something. But there was a time when we were intimate enough with our computers and with few enough of them that we actually gave them names. What kind of, I'm not familiar with 2200 at all, like what kind of storage or processing capability did it have? Well, I was just starting out and I didn't completely understand.
Starting point is 00:26:55 But it was the computer that our high school made available to students. To all of the students, there was one computer that we shared. And this was not a time sharing system. This was you signed up your name on the schedule for using the computer to actually sit in front of the console and and that's what you did and you programmed it in basic and it had a mark sense card reader and um and that's what I learned. And five and a quarter inch floppy drives. Five and a quarter? Okay. Floppy drives where they actually flopped.
Starting point is 00:27:32 I was expecting you to say the eight inch floppies, which I didn't. I definitely did not go back that far myself. I did work a little bit on a computer that had eight inch floppies that was also donated to my high school later. But it was something of an antique even then. Yeah, I think I've literally seen an 8-inch floppy disk once.
Starting point is 00:27:55 I'm not sure if I have. Definitely seen the 5-inch floppy disk. You're too young. We can just play that game. You can tell me I'm too young, I can tell Rob he's too young. Yes, the antique computer with the 8-inch floppies was actually rather convenient because not very many people knew how to use it,
Starting point is 00:28:09 and so it was much easier to sit in front of it. That's a fun way to do it. The computer no one else wants to use. It can give you a lot more time, for real. Okay, so we talked a little bit about the Two's Complement paper. Were there any other papers that made it out of SG6 into C++20? Certainly there were. I am not coming up with any off the top of my head,
Starting point is 00:28:39 but I can tell you about the one that I am, that did get, there was one of mine that got, you know, first level of approval for C++23 at the last meeting. Which is, I'm proposing a rather large set of function templates that are called things like can add. And can add takes two numbers and tells you whether you will have defined behavior if you add those two numbers, if you put a plus between those two numbers, and whether or not you will actually get the mathematical sum of those two numbers if you add them. So I'm proposing functions that correspond to all the arithmetic operations that just test to see whether you can do that and will get the right answer. For instance, they will test for overflow. But you have to have compile time constants, like these two specific values.
Starting point is 00:29:48 No, these aren't traits. These are functions. It's a function that takes two numbers and says true if you can add them together without overflow. And it says false if you can't. I would love to see a corollary to that that has one that then gives you the overflow for the cases that I want that. Yes, there are some plans in the works for something like that. I can't off the top of my head say exactly what those are, but that's actually the sort of work that Lawrence retired as the previous, Lawrence Crowell retired as the previous SG6 chair in order to work on editing
Starting point is 00:30:36 those ideas into a document. So this, you know, this paper was just cutting off a tiny piece of the problem that was well-defined and we can, you know, could move directly. That's interesting. Yeah. And it was particularly needed for, you know, particularly with contracts we were hoping to come along, we're going to need to say what the contract for the basic, what the contracts corresponding to the basic arithmetic operations are. Okay. And so these functions can be used to express those contracts in a nice concise way,
Starting point is 00:31:21 instead of remembering how to do the check that I can multiply these things together without overflow and not during the check actually accidentally overflow because that would be bad. Yes, I like the idea. Yeah. And you also did get a paper into C++20 yourself, right? Yes. This one took a surprising amount of time. It was actually the first paper I thought I would get passed by the committee when I introduced it more than three years ago. But it took a surprisingly windy path through the subgroups of the committee. And in the end, every major subgroup looked at this paper, and some of them more than once, most of them more than once. So what's the title of this one? The title is, I think, Traits for Layout Compatibility and Po pointer interconvertibility.
Starting point is 00:32:26 And layout compatibility and pointer interconvertibility are the two things that let you, or are two of the things, there's a third, but these are two ways to do type punning in C++, and both of them interact with C in certain ways. Layout compatibility governs type punning through unions, and pointer interconvertibility governs type punning through reinterpret cast. And in both cases, they only work sometimes. And there's a fairly narrow set of situation for each that you need to be in write a test to see if you were in the situation where your type
Starting point is 00:33:29 punning trick would work. So that was the goal here, was to introduce a way to write an assertion, ideally a static assertion. I don't think that there's a great deal of need for this at runtime but to write a static assertion that says yes I am definitely this is the thing I'm counting on and if it's not so break my build so that I can fix it so do you know if there's any implementations of it currently in the standard library or it requires compiler intrinsics?
Starting point is 00:34:05 It does require compiler intrinsics. And as far as I know, it's not yet implemented. The core group, despite the fact that I had to go to them several different times with revisions of the wording. I don't think they ever changed their initial determination that, yes, we can do that, and that's going to be fairly easy. writers to coordinate with their library team in order to get it in because it's actually in addition to the library but relies on compiler intrinsics you and i had a conversation was it two days ago i believe now uh here at cbp con um about just in general reinterpret cast and how it can be almost impossible to get right most of the time, which is part of that. Yes, and I think that was actually joining on the tail of somebody else's conversation about what paper I'd just done.
Starting point is 00:35:11 And you said you were also, there was some movement about the idea of being able to bless a pointer or something, whatever the terminology was. Yes. So there is a thing people want to do that really is not allowed by the current rules, which is to, say, load a block of memory off of disk or out of someplace, and then impose some complex data structure that's presumably you know, presumably made just from scalars, not pointers, or not made from pointers or things that wouldn't be preserved. I shouldn't say a pointer isn't a scaler. Right. But a pointer is a bad thing to load from disk. Yes. front desk yes um and but just to somehow say that data is there and i want to use it as a
Starting point is 00:36:10 particular type right and people want to do that um and for perfectly good reasons but it's very tricky to in the c++ memory model, the object model, to invent a new object out of thin air in that way, to say, poof, there's an object. And so, but there's work afoot that we may be able to make that work. And we are hoping that we will even manage to make it work in a way that coordinates with the C committee. And so, to some extent, I don seen that because that sort of work comes before SG-12, the study group on unspecified and undefined behavior and software vulnerabilities.
Starting point is 00:37:15 Okay. More for the unspecified and undefined behavior. You know, this is, we want to turn something that is currently undefined behavior into defined behavior if you do it right. Or rather, if you like call the special library functions that will do it for you in the right way. So do you think this does have a future? I think it does have a future. It's not a certain thing. And it's, you know, unless it snuck by me in a big way, it didn't make 20.
Starting point is 00:37:50 But, you know, I think it's something everybody wants to do. It's just something that's subtle. And so we do the subtle things slowly and hope to get them right because it's bad when we don't. In your bio, we mentioned that you design the software architectures for Tanium and BigFix. Can you tell us a little bit more about that? Okay. So Tanium is my current employer. I used to work at BigFix, but eventually I left. BigFix is now a part of IBM the same place. These aren't server room computers for the most part.
Starting point is 00:38:47 These are computers that are sitting on 100,000 desks throughout your organization. And so if you have 100,000, half a million computers in your organization that need to be kept track of and managed and so forth. Tanium is the tool for that. We sometimes think of it as taking the things you know how to do on the one computer in front of you that keep it managed and up to date and so forth, and letting us scale that to enormous size. So does it involve components that run on the machines as well? Yes, we run an agent on each machine and form them into a peer-to-peer network that
Starting point is 00:39:38 helps us get answers, you know, be able to ask questions and get answers back quickly and to deliver things to all the computers in a practical way. And one of the more exciting uses is in cybersecurity. And yes, I have had to learn to live with the area I am in is called cybersecurity. And it's a particularly interesting tool for doing postmortems of intrusions because we can collect information about what's going on across this enormous number of computers in a very fast way so that you can interactively ask a question, get the answer, see what the next question you should ask is. Okay. And it supports multiple platforms? It supports a great many platforms. I don't even know how many.
Starting point is 00:40:42 Sounds successful then. We're doing pretty well. Well, Lisa, I hope you enjoy the rest of the conference. Thank you so much for joining us today. You're welcome. Thanks so much for listening in as we chat about C++. We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, we'd love to hear about that too.
Starting point is 00:41:02 You can email all your thoughts to feedback at cppcast.com. We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter. You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank all our patrons who help support the show through Patreon. If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode was provided by podcastthemes.com.

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.