CppCast - Analyzing Undefined Behavior
Episode Date: February 21, 2019
Rob and Jason are joined by John Regehr to talk about his job as a professor at the University of Utah teaching C++ courses and some of his research projects, including Souper and Csmith.
John Regehr is a professor at the University of Utah where he's been on the faculty since 2003. He likes to work on compilers and software correctness, but used to work on real-time and embedded systems. When he has free time he likes to go hiking in the desert with his kids.
News: Five Awesome C++ Papers for Kona 2019 ISO Meeting; The future of Catch2; Some C++ on Sea videos already available; Between linear and binary search
John Regehr: @johnregehr; John Regehr's Personal Page; John Regehr's Blog
Links: Souper; Csmith; C-Reduce; C++Now 2018: John Regehr "Closing Keynote: Undefined Behavior and Compiler Optimizations"
Sponsors: Download PVS-Studio; Technologies used in the PVS-Studio code analyzer for finding bugs and potential vulnerabilities
Hosts: @robwirving @lefticus
Transcript
Episode 187 of CppCast with guest John Regehr, recorded February 20th, 2019. Static analysis of programs written in C, C++, C#, and Java.
In this episode, we talk about Kona papers and search algorithms.
And we talk to University of Utah professor,
John Regehr.
John talks to us about teaching C++ and his compiler research projects.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
Doing pretty well, Rob. How are you doing?
I'm doing fine.
Getting over a little cold.
Just had family in town for the week, and life is carrying on.
Things get to be busy again for me, travel-wise, coming up here soon.
Looking forward to Core C++ in Tel Aviv.
Yeah, that should be fun.
Yeah, March is going to be busy as well, so hopefully we can work that out with our schedules.
Yes.
Okay, well, at the top of every episode, I'd like to read a piece of feedback.
This week, we got a comment on the website from Ray saying,
Hey, Dark Matter Programmer here.
Fun show.
Go has this feature since 2011 that you guys were discussing regarding updating your code with new versions.
With the new release, Go tool fix could be used to update your code to new APIs.
Here's from Ray. So yeah, we were talking with Edaqa about, you know, Leaf, his language,
and how he wanted to upgrade the code from one version to the next, and how it might work in C++. I know
Titus Winters has talked about doing it. Yeah, and specifically, it sounds like Ray here is
talking, mentioning Go. He says to use new APIs, and it makes me wonder if it's breaking changes in the language as well,
or if it's just about APIs, which I think is more the kind of thing that Titus has been talking about.
But it'd be nice if we could break some old C++ features and just have it automatically get taken care of by compilers.
Right.
Okay, well, we'd love to hear your thoughts about the show as well.
You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com.
And don't forget to leave us a review on iTunes.
Joining us today is John Regehr.
John is a professor at the University of Utah, where he's been on the faculty since 2003.
He likes to work on compilers and software correctness, but used to work on real-time and embedded systems.
When he has free time, he likes to go hiking in the desert with his kids.
John, welcome to the show.
Hi. Thanks a lot for the invite.
Yeah. Utah gives you lots of opportunities for hiking, right?
Yeah. There's a pretty big mountain range right outside the city here. And then,
yeah, it's pretty nice. And then we have sort of good access to the national parks like Zion
and Capitol Reef and such. Is the University of Utah in Salt Lake City?
Yeah. It's basically maybe a mile or two down. It's kind of right along.
Coming from Denver, I always get confused when I'm in Salt Lake City
because I'm used to the mountains being on the west.
And when you're in the Salt Lake City,
I feel like I never know which direction I'm facing
without having the mountains clearly as a landmark for me like that
because I feel like, I don't know, it's less obvious there.
And it's on the wrong side.
Okay, well, John, we got a couple news articles to discuss. Feel free to comment on any
of these, and we'll start talking more about the work you're doing. Okay, great. Okay, so this first
thing we have is an article from Bartek's blog, and we've talked about several of his
articles before and had him on the show once. And this is Five Awesome C++ Papers for the Kona 2019 ISO Meeting, which is happening
right now. And he starts off by doing a good review of what is already in C++20,
and then he highlights five papers that were just being introduced at Kona,
a couple of which we've already talked about, like the nodiscard should have a reason paper
that was written by Isabella Muerta and JeanHeyd Meneide.
Yeah.
Yeah.
The freestanding paper, if that goes through,
to be able to have some good guidance on that would be nice.
Yeah, and then there's another one,
integration of chrono with text formatting.
I know we've talked about the format library a couple times.
Having that apply to things like Chrono makes a lot of sense to me.
Yeah.
John, did you have a chance to look at this article?
I did, and I didn't have too many specific thoughts about those papers,
but I was sort of interested by some of the new items for C++20.
So I don't know that they're the most important overall,
but the ones that I found really interesting are the addition of contracts, I think is really great. I think
this is something when I teach students to program, I really try to lean on them to put a lot of
assertions in and do a lot of kind of try to try to make their assertions and their unit tests play
well together. And I think contracts will really kind of play to the strengths of that sort of,
that sort of thing. And then I also like that C++ is two's complement now.
That's kind of a long overdue change, I think,
that's really nice to see.
That'll simplify a number of things.
So sometimes when I'm talking to people about contracts,
some people feel like it maybe wasn't quite ready for primetime.
Are you familiar enough with the actual what was accepted in the standard
to have any opinion on what was accepted?
No, no.
I looked it over, and it looks interesting,
but I have not been following the sort of technical development.
Okay.
I think the main thing that stood out to me
was in some conversation I had with someone
was it might not play nicely with constexpr,
which would be a huge deal to me,
but that's still to be determined.
Well, yeah, I think this is the thing with C++ is it's gotten pretty big and complicated,
and I guess everybody's aware of this, but it's one of these things where it's kind of
interesting to balance these really cool features with the fact that it's pretty hard to play
well together.
Yeah.
Okay, the next article we have is by Martin Hořeňovský, hopefully I'm pronouncing that
right, and he's the current main maintainer of Catch2.
And this article is about the future of Catch-2
and how he is considering,
and it sounds like he's going to go forward
with changing the distribution model
from a single header library
to actually splitting it up amongst multiple
source files and actually building it,
so we can deploy it through things like vcpkg and Conan. And he goes about some of his reasoning for
wanting to consider this change, and talks about his experience maintaining it,
because he's kind of taken over the project from Phil Nash and done the
majority of maintenance and closing issues over the past few years. Yeah, it sounds like a big change and a big deal.
I did not realize that Phil was not the main active maintainer at the moment.
Yeah, but it sounds like he got good feedback when he proposed this online,
and it sounds like it makes sense.
He might keep an older version of the library as a header only,
so if you want to use it in that fashion for quick examples, you can still do so, which I think is probably a good move. I'll say for my part, I do
use the header-only version, but at this point I'm using it via Conan, and so if it was a
seamless change and the next version that it grabs just did the right thing, then it wouldn't really
matter to me. Right. He didn't go too much into specifics, but what exactly are the problems with
ingesting single-header libraries in something like Conan? Well, I mean, all it does is download
a header, basically, in Conan, and actually using it is pretty easy. But I think his
main concerns are things like compile time. Okay. Okay, and then the next thing we had here was
C++ on Sea, where you just came back from, Jason.
They already have like almost 30 videos online.
So Phil is uploading these really quickly, I guess.
Yes, they are going very quickly.
Yes, that's great.
I'll have to start watching some of these.
Any specific ones you wanted to point out aside from your own talk?
Well, my talk's not up there yet.
Yours isn't actually here yet.
Okay.
There were technical problems. Oh, okay. But it should be up soon. It might not end up going up. Oh, okay. That's too bad.
Yeah, but you know, it happens. I'm not too concerned about it. You
can only go to so many talks when there's three or four tracks happening. But I really liked
Björn Fahller's Programming with Contracts in C++,
bringing it back around to contracts. Okay. And then the last thing we have talked about
is this article between linear and binary search. Jason, you want to talk about this one?
Yeah, this was actually given to me from someone that I met at the conference. I just clicked on
the article and a pop-up came up on top of it.
It was given to me by someone from C++ on Sea. And it's just an algorithm instead of doing a linear
search or binary search, he has this biased search, where he's saying in the most likely scenario,
the thing he's looking for is near the top of the container. So it's doing like a power-of-two kind of search
that favors the top each time it searches. Now, at the start, he says that he was unable to
find a name for this algorithm. But there's some possibility that it was like a skip search or
something like that. And since we have a computer science professor amongst us, I was wondering if you had any thoughts on this. No, I don't know the name for this,
but I did enjoy the piece a lot. I really like these kind of deep dives. And I always find it
really interesting how these things which seem so simple, these things we cover, you know,
kind of in, you know, first and second year courses often, you know, when you actually look
into it and look into the performance effects, there's almost this kind of fractal complexity where, you know, it just gets more and more kind of relevant detail and interesting stuff going on.
The deeper you look. I really like this sort of piece.
I think it's great to get this kind of thinking put down somewhere where people can see it.
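The article doesn't give the algorithm a name, but the shape Jason describes (probe indexes 1, 2, 4, 8, ... and then binary search inside the resulting bracket) is commonly known as exponential or galloping search. A minimal sketch, assuming a sorted vector of ints:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Search a sorted vector, biased toward the front: probe indexes
// 1, 2, 4, 8, ... until we overshoot the key, then binary search
// only inside that bracket. Cost is O(log i) where i is the key's
// position, so hits near the front are very cheap.
bool biased_search(const std::vector<int>& v, int key) {
    if (v.empty()) return false;
    std::size_t hi = 1;
    while (hi < v.size() && v[hi] < key) hi *= 2;   // gallop forward
    std::size_t lo = hi / 2;                        // last probe below the key
    hi = std::min(hi + 1, v.size());                // bracket is [lo, hi)
    return std::binary_search(v.begin() + lo, v.begin() + hi, key);
}
```

The two-phase structure is the whole trick: the galloping phase pays for itself only when lookups really do cluster near the front, which is exactly the access pattern the article assumes.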
Right.
Thanks.
Okay.
Well, since you were just talking a little bit about, you know, studying computer science, what classes are you currently teaching, John?
Well, right now I have just one.
It's a software engineering course,
and it's a little bit different than our usual kind of courses
because this is for a new professional master's track we have.
Students sort of come in and do a kind of accelerated master's degree in 18 months.
Anyway, and I'm doing the software engineering component.
It's really been a lot of fun.
Right now they're writing an image compression assignment in C++,
and I was just thinking while we were talking about Catch2 that I should have them maybe use that for some of their unit tests.
I hadn't.
We've kind of been doing homegrown unit test stuff,
and I think one of these nice solutions is the next thing I should do.
So I think you kind of indirectly brought up a question
that I was considering asking you about. Often when we have people who are fresh out of university or
whatever on the show, this discussion comes up about whether or not computer science curriculum
should be more of a theoretical thing or more of an applied thing. And it sounds like you're
talking software engineering is more like an
applied computer science kind of basis for this master's degree. I don't know, would you mind
going into that? Yeah, well, to sort of to start off the, you know, the question you raised,
of course, is one we sort of kind of fight with every day where, you know, we want to be relevant,
we want the students to kind of have seen things that they're going to see out in the world.
But on the other hand, we don't want to teach them things that are going to become obsolete in two or
three or five years. And so we try to walk this balance, walk this line
sort of just like you say. And this kind of professional master's track does fall more on
the applied side. But on the other hand, we try to teach, testing is this incredibly general thing
that requires a lot of creativity, a lot of hard work, a lot of hard thought.
And, you know, I try to emphasize that aspect of it
while also making sure that they see the kind of tools that I,
you know, some of the best of tools that they're likely to use out in the world.
Okay.
So you said this is the only class you're currently teaching,
but I'm guessing you've taught other courses in the past.
What are some of your favorite classes to teach?
Well, my current favorite ones I think right now are, I've been switching off every year
between advanced compilers and an advanced operating system.
And both of those we sort of, I make those quite applied as well, where we do some sort
of theoretical readings, but then we just spend a lot of time reading.
And in the operating systems course, we usually can read the entire source code for this operating system kernel called XV6.
It's kind of a reimplementation of Unix version 6, but for modern x86s.
And it supports multi-cores.
And, you know, this thing boots up and it's a real OS.
But in all other respects, it's kind of as simple as possible.
It sort of barely has a file system, barely has a shell.
Everything is really, really simple.
But you can read the entire kernel in a semester. and that's really fun for people, I think. It's really nice to see the
kind of diversity of things that live in an operating system, and I really kind of enjoy
walking around. And then the assignments in a course like that are, you know, adding system
calls, adding features to the file system, maybe tweaking the scheduler, these kind of things. And
unlike when you use Linux or something as a basis for an operating system course, you can kind of rewrite kind of major parts of this operating system because
it's so tiny. Whereas in Linux, we're really confined to kind of playing around on the edges
because it's this sort of monster monolith that's, you know, very, very kind of hard to operate.
So when your students modify the operating system and are playing with it,
do you use virtual machines for the testing? Yeah, absolutely. Absolutely.
So XV6.
And so I'll send you guys a link after we're done here.
But it's this open source.
It's written by some people at MIT.
And it basically comes with an environment where it boots up and it just basically works.
And it even, you know, you can bring it up in multi-core.
It all sort of is really seamless.
This kind of thing where I remember when I took an operating system course first, you
know, a number of years ago, you know, this is developing, we're hacking on
Minix on 286s. And, you know, you had to have dedicated machines, and you had to, you know,
we had to boot them off of floppies and stuff. And this was, this was, you know, kind of fun,
I think, at some level, but, you know, just dramatically more work than it is just to bring
something up now. Now, you also said advanced compilers is your other favorite course,
I believe that's what you said.
Yeah.
I was wondering if there's an intro to compilers and this is the second level,
or if that is like the main class is just called advanced compilers.
No, no, you're exactly right.
There's an intro to compilers course, which I, for whatever reason, haven't ever taught.
I think we just happen to have other faculty who cover it pretty well.
But yeah, in advanced compilers, what I do is base it on LLVM, which sort of, so it's a little
bit the opposite approach as I use an advanced operating system. So working with LLVM, now we're
stuck with a big monolith that we can't really change much of the core parts of. But what I do
is I have them work on new optimizations, work on static analysis, and sort of the theoretical
component of this course is static analysis,
which is sort of predicting the behavior of programs without running them.
This is very core to optimizing compilers.
You can, for example, predict that a pointer is never null,
the null checks that the programmer has put in,
they're just not ever going to fire.
And so we have lots of, so that's really a lot of fun.
And yeah, anyway, so that's sort of a neat course to teach.
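A hedged illustration of the null-check reasoning John mentions (the function and names here are made up for the example, not taken from his course):

```cpp
#include <cstdio>

// Because dereferencing a null pointer is undefined behavior, once *p
// executes the compiler may assume p != nullptr from that point on.
// A static analysis can therefore prove the check below never fires,
// and an optimizer is free to delete it as dead code.
int first_element(const int* p) {
    int value = *p;        // UB if p is null, so p is assumed non-null here
    if (p == nullptr) {    // provably never true; typically removed at -O2
        std::puts("null!");
        return -1;
    }
    return value;
}
```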
Has it ever happened that any of the projects from your students
ever end up becoming pull requests into LLVM or Clang?
You know, I always kind of push that.
I always tell them I'll just give them an A if they can make this happen.
And, you know, it's just there's always so much time pressure.
And a lot of the stuff we're doing is kind of, you know, it's kind of, you know, sort of wonky experimental stuff.
Like I'll tell them to do something like increase the precision of one of the static analyses in LLVM.
You know, and people can do this, but, you know, when you really look at it,
is the slowdown of the compiler introduced by this change worth the increase in precision?
And usually it's probably not, just because we're just sort of playing around.
And I have them aim for kind of low-hanging fruit.
And so a lot of times it's sort of,
you know, the stuff that actually needs to be done is kind of maybe less fun.
And so it hasn't happened yet,
but I keep waiting for it.
Now, I'm also curious,
going back to contracts,
like at least on a theoretical level,
what opportunity do you think there is
that contracts can lead to different optimizations
in the compiler? If it says, well, you've already promised me this thing's X or whatever.
Yeah, well, the straightforward sort of runtime interpretation of contracts is just to check them.
So, you know, at a surface level, they're only going to make things slower. And, you know,
I think this is a kind of slowness that we're happy to pay, right? Because the, you know,
we're trying to optimize for human time.
And we're sort of trying to document these important properties.
But, yeah, as you say, you know, you could turn these assertions into assumptions where the compiler is then allowed to assume that these properties hold.
And you could even push these things around in an interprocedural fashion.
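A minimal sketch of the assertion-versus-assumption distinction, assuming GCC or Clang for the `__builtin_unreachable` extension (the function names here are invented for the example; C++23 later added `std::unreachable` for the same purpose):

```cpp
#include <cassert>

// Turning a checked contract into an assumption: if the condition is
// false, behavior is undefined, so the optimizer may assume it holds.
inline void assume(bool cond) {
    if (!cond) __builtin_unreachable();  // a promise, not a check
}

// With the assumption that m == 16, the compiler is allowed to lower
// x % m to a cheap mask (x & 15) instead of a division.
unsigned mod16(unsigned x, unsigned m) {
    assume(m == 16);
    return x % m;
}

// The runtime-checked version of the same contract: slower in debug
// builds, but it catches violations instead of silently invoking UB.
unsigned mod16_checked(unsigned x, unsigned m) {
    assert(m == 16);
    return x % m;
}
```

This is the trade-off John describes: the checked form only costs time, while the assumed form can speed things up but makes a contract violation undefined behavior.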
And so there are, I'm sure, opportunities
to speed up programs, but my guess is that those haven't been seen in practice very often, if ever,
if you see what I mean. Sure. But, you know, there's... Okay. So in addition to your coursework,
do you do research of your own? Yeah, I have some research projects, and basically they're
in compilers these days. And the one that I'm kind of most excited about right now, even though I've been working on it for
several years, is a superoptimizer for LLVM called Souper, and what this does is kind of look through
the program being compiled at the IR level.
So for those people who aren't familiar, LLVM is very centered around an intermediate
representation, an IR, which expresses the program, and it sort of looks a lot like an
assembly language.
It's got explicit loads and stores, it's got adds and shifts, and these kind of, the control
flow is not structured, it's jumping.
So it's very much a neutral, sort of an architecture neutral assembly language, and the way LLVM
works is a front end converts your source programming language, like C++ or Rust or
whatever, into the IR.
Then we have a bunch of language agnostic or at least mostly language agnostic optimization passes which run to try to improve
the IR and try to make it as sort of as clean and fast as possible by cutting out, you know,
maybe dead code. And then finally there's a back end which translates the intermediate representation
into x86 or ARM or something. And so what the super optimizer does is it looks at IR, it kind
of runs as an optimization pass. The thing that makes it different from the existing optimization
passes is it doesn't kind of have a canned idea of what to do to the IR to make it better. It
rather asks an automatic theorem prover. It kind of phrases a question about the IR in such a way
that an automatic theorem prover can come up with, potentially at least, an improvement to the IR.
And so one example that I'm kind of proud of is some codes end up with pop counts in them.
And pop counts are, this is an operation that counts the number of set bits in a machine word.
And in some domains, pop count is kind of important, as you can imagine,
and most programs don't do this at all.
But things like chess programs and Vue.exe and things might end up doing pop counts.
So what we often do is we write an open-coded pop count.
That is to say we come up with some sort of an algorithm for a fast way to do it,
and we put that in our C++ code, and that sort of sits there.
And what that does is it misses an opportunity to use, for example, a better algorithm
or maybe an actual machine instruction for pop counts.
So, for example, a Core i7 will have a dedicated pop count instruction that's actually quite fast.
You know, on an Intel Atom or something, on a laptop chip, it might be microcoded and real slow,
but the high-end chips will have a very fast pop count.
So anyway, going back to Souper, what Souper can do is, using the solver,
it can actually, at least in some cases, not always, but depending on how it's coded,
but it can recognize the pop count from the open coded version.
So it can sort of see that this collection of instructions taken together does this operation
and then it's free to sort of change the implementation to some other one.
For example, LLVM happens to have an intrinsic for pop count and then the backend will translate
that into whatever is most suitable for your target.
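For concreteness, here is a hedged sketch of what an open-coded pop count typically looks like. This is the classic parallel-reduction bit trick, and it's exactly the kind of instruction sequence Souper (or an optimizer's pattern match) has to recognize as a pop count before it can substitute an intrinsic, which the back end can then lower to a `popcnt` instruction:

```cpp
#include <cstdint>

// The classic open-coded population count: a parallel reduction that
// sums bits in pairs, then nibbles, then adds the four byte counts.
int popcount_open_coded(std::uint32_t x) {
    x = x - ((x >> 1) & 0x55555555u);                  // 2-bit field sums
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);  // 4-bit field sums
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;                  // 8-bit field sums
    return static_cast<int>((x * 0x01010101u) >> 24);  // total in top byte
}
```

C++20's `std::popcount` expresses the same intent directly, which sidesteps the recognition problem entirely.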
So it's really the kind of, the thing that's sort of fun about doing something like the
super optimizer is it sort of, it's not creative, of course, but it kind of has a little
bit the feeling of being creative. It comes up with sort of interesting alternatives to the code
that's there. And it's sort of, that's, I think, the reason I like working on is that it's sort of
fun to look at its output. It comes up with sort of clever things that humans wouldn't have thought
of in most cases. So is that at all part of the LLVM project? Is it something we download and run separately? It is totally not part of LLVM. And so the way it sort of currently works
is it runs as an optimization pass and it interacts with a cache. So discovering new optimizations
turns out to be pretty slow and inefficient. And so it's not something you want to really be
waiting for. So what we can do though is we can sort of strip a lot of interesting code and put
it in the cache. And then we can have some sort of a multi-core grovel over that cache offline.
And then next time you compile using Souper, it would have access to the contents of that cache.
So that's kind of the short-term solution, the research solution.
And that's not necessarily a very good idea for a production scenario.
And so there's something that's not working at all yet, but that I would like to do in the future, which is generate rules for the compiler based on discovered optimizations that
sort of are very simplified versions of the optimizations that can run as part of the regular compiler.
So is this like, if you're working on a project where every nanosecond is important,
would you recommend running it through Souper in its current state today?
No, definitely not. Okay.
There are some real problems where, and one of the problems we've been fighting for multiple
years, and it's a very thorny problem, is LLVM has, at the IR level, at the internal
representation level, it has some concepts of undefined behavior.
And so most of us are sort of familiar with undefined behavior at the C or C++ level,
but these internal notions of undefined behavior are kind of even harder to think about,
because they're designed to sort of, they're designed to have sort of a partial effect on
the program meaning, which is to say when something is undefined, it's just that value
that's undefined, and it's not the sort of whole program which becomes undefined like in C++.
Okay.
And what that means is that sort of reasoning about these programs is actually pretty difficult,
and it turns out that a lot of different people contribute to LLVM, and not all of them have
sort of understood undefined behavior in exactly the same way.
The rules are not really clear anyway.
And so what that means is different parts of the compiler have made kind of conflicting
assumptions, at least at times.
And so there's a little bit of internal non-coherence surrounding this. And this tends to generally be latent.
This tends to generally not really affect users. But Souper, being based on an automatic
theorem prover, will go ahead and see them. It's a little bit like a lawyer.
It looks for a little loophole that it can exploit. And if something has been a little bit incautious about undefined behavior, Souper will then break it. Okay. And so
right now what we're doing is working to create a version of LLVM which doesn't have any of those
problems. And it's turning out to, you know, be a bit of an engineering effort. And the sort of overall
problem though is that since these problems aren't really causing most LLVM users much pain,
there's not a lot of priority in the
community to fix it. And so it's a problem mostly seen by us, and so it falls on us to fix it.
Right. That's interesting. So do you have a plan for moving forward?
Yeah. There's a number of us who are sort of interested in this kind of making the version
of LLVM that doesn't have any of these problems. And so we're collaborating on this. And then
hopefully these solutions will be, you know, once these solutions are proved to not affect code generation quality, hopefully we can
move these kind of changes into the core LLVM distribution, and then you won't have these problems.
Can you talk any more about what exactly an automated theorem prover is? I don't think
we've really talked about that much on the show before.
Sure. Well, so what it is, is a program that you can ask essentially math problems.
And so the theorem prover that we mostly use here is called Z3, and it's a tool created by
Microsoft Research, and it's open sourced. So it's very easy to include into a project. But what it
does is, like I say, it attempts to answer math questions. And so if you ask it, for example,
for a solution to the N-Queens problem, what you have to do is kind of
describe the rules for the N queens problem. And so, sorry, I should explain what that is. The N
queens problem is setting a number of queens on a chessboard so that none of them can attack any
other. So that's a, you know, it's sort of a common logic puzzle that you might ask a kid to solve.
You might hand them a chessboard and a pile of queens and see if they can do it. But if you
describe the rules to this kind of a thing, then the solver can sort of come up with a solution to it using its own sort of,
it has kind of an internal sort of sophisticated collection of heuristic searches and other kinds of searches that allow it to home in on these solutions.
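To make the N-queens constraints concrete: the snippet below is not an SMT solver, just a plain backtracking search, but the conditions it checks (one queen per column, distinct rows, distinct diagonals) are exactly the constraints you would hand to a solver like Z3 and let it find a satisfying assignment itself:

```cpp
#include <cstdlib>
#include <vector>

// cols[c] = row of the queen in column c. Placing one queen per column
// already encodes the "no two in a column" constraint; the loop checks
// the remaining two: no shared row, no shared diagonal.
bool place(std::vector<int>& cols, int c, int n) {
    if (c == n) return true;  // all columns filled: solution found
    for (int r = 0; r < n; ++r) {
        bool ok = true;
        for (int prev = 0; prev < c; ++prev) {
            if (cols[prev] == r ||                      // same row
                std::abs(cols[prev] - r) == c - prev) { // same diagonal
                ok = false;
                break;
            }
        }
        if (ok) {
            cols[c] = r;
            if (place(cols, c + 1, n)) return true;  // backtrack on failure
        }
    }
    return false;
}

bool solve_n_queens(int n) {
    std::vector<int> cols(n, -1);
    return place(cols, 0, n);
}
```

The difference John is pointing at is who does the searching: here we wrote the search strategy by hand, whereas an SMT solver is handed only the constraints and applies its own internal heuristics.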
And similarly, you can ask it, for example, you can tell it the rules of Sudoku and give it a partially solved Sudoku,
and it'll come up with a solution quite a lot faster than a brute force Sudoku solver. So it's using these kind of sophisticated internal
algorithms, and they tend to be pretty domain independent. They tend to be pretty independent
of what you ask it. And so of sort of a wide variety of kind of questions that you might want
to ask about mathematical situations, it turns out that these things are sort of pretty okay at
answering. And it turns out that in areas, for example,
like formal verification of software,
these used to be kind of prohibitively difficult.
It used to be incredibly hard to kind of actually conclude
that a sorting routine actually sorts its output
or that some sort of routine in C++ that's controlling a robot
doesn't ever access a null pointer and crash the robot.
So this used to be a whole lot of work,
and it turns out that these automatic theorem provers offer us a way out. If you describe the
meaning of the programming language to the automatic theorem prover, then you can often
just ask it the question, does the program ever dereference a null pointer? And it will try to
either come up with an example where it does, or it will try to sort of prove that it doesn't.
And that's been one of the main kind of use cases of these automatic theorem provers is
to try to find defects in things like chips.
So, for example, I have a friend at ARM.
So, you know, so ARM, of course, makes or designs at least the processors used in almost
all mobile devices.
And he works on formal verification of their processor designs using provers like Z3.
And so we can also apply these to software.
And like I say, this has been kind of the main use case of these sort of tools in recent
years.
And not as much attention has been paid to kind of trying to optimize software using
these tools.
And so that's why I thought it would be kind of fun.
I wanted to interrupt the discussion for just a moment to bring you a word from our sponsors.
PVS Studio Analyzer detects a wide range of bugs.
This is possible thanks to the combination of various techniques, such as data flow analysis,
symbolic execution, method annotations, and pattern-based matching analysis.
PVS Studio team invites listeners to get acquainted with the article
Technologies Used in the PVS Studio Code Analyzer for Finding Bugs and Potential Vulnerabilities,
a link to which will be given in the podcast description.
The article describes the analyzer's internal design principles.
So are there, I mean, maybe kind of, I don't know, taking this back a step, but are there any existing free tools out there that can prove things about our C++ programs, to, like, prove the correctness of them?
Well, this is, yeah, this is a really great question.
And it's a little bit of a thorny issue
because the problem is most of these kind of tools
are made by people like me,
that is to say by researchers
who have a pretty low engineering budget,
if you see what I mean.
And the low engineering budget
combined with the size of C++ is really not a good match.
And so almost all of the people like me work on C.
And that may also be a good match because a lot of safety-critical software is in C,
and a lot of avionics software.
I don't think much of that is in C++ yet.
And despite the fact that we see, I think there's going to be increasing uptake of C++ in embedded systems,
a lot of the existing stuff is C.
And so this is a bit of a
thorny problem, and the kind of tools that reason about C++ end up being produced by organizations.
And so, for example, the Clang Static Analyzer, which is, I'm not sure if it's part of LLVM or
if it's a side project. Anyway, this can reason about your C++ code and look for likely bugs in it.
And some versions of this,
at least some of the time,
will call out to an automatic theorem prover
to answer sort of questions
about whether something can happen.
Okay, that's interesting.
Yeah, so it's really sort of a problem.
I mean, it's a little bit depressing,
but like I said, the size of the language
really, really does not work in favor
of the small players like most of the academics. So another project you're working on
is Csmith. Is that right? Yeah. Well, I'm not really working on it anymore, but I spent a number
of years working very hard on this project. Yeah. Could you tell us about it? Sure. So maybe I'll
just sort of start with a little bit of a story. So a number of years ago, I was teaching an embedded systems class,
and I was showing the students in a slide the translation of some little C and C++ snippets to assembly language,
and I was showing them sort of how the volatile qualifier works,
how if you don't have it, then things will be optimized a certain way.
And if you make some variable volatile, then the compiler has to emit these loads and stores to do memory operations.
And of course, when you're writing embedded software or operating systems and things like
that, where you're talking to memory-mapped peripherals, the presence or absence of a
load or store, this really has to be right.
So I was lecturing on this and one of the students pointed out an error in my slide
and said, hey, that translation is wrong.
And so I said, oh, well, I probably cut and pasted wrong.
But when I went and looked at it later, it turned out the compiler had mistranslated the code.
It got the volatile wrong. And so I kind of shelved it, put this on the back burner, and then
somehow came back to it later. And it turned out that the harder I looked, kind of the more bugs I
saw with volatile. It was really, compilers were really not doing a good job following the rules.
And you can sort of see the problem as a compiler writer. One problem is that these loads and stores really aren't visible normally. If your program does or
doesn't have them and you're running a test suite or something, it just runs, right? You don't really
notice this. So the compiler developers were having a hard time noticing that they were wrong.
And then the other thing that was happening, I think, is that compiler developers love to write
optimizations. They love to make code go fast. And it was really, really easy when adding a new optimization
to forget to check if some variable was volatile before doing some optimization on it.
Interesting.
It really flies.
This requirement to not do these optimizations really flies in the face
of what optimization implementers want to do with their time.
And it just turned out to be that people miss this a lot.
So anyway, so what happened is we started to, I ended up with a file full of test cases that lots of compilers
got wrong and started reporting some bugs. And then one of the things a colleague and I realized
was that, well, maybe we should start to look for these bugs more systematically. And so my
colleague, Eric Eide, took an existing C program test case generator.
So it was a program called Randprog that generated random C programs.
And he took that and he started retooling it and reengineering it to be about looking for errors in volatile.
And then what we did is we would run these programs.
We'd generate a random program.
We'd compile it with and without optimization.
And we'd run it in kind of a hacked Valgrind, which would report the number of loads and stores to every variable.
You know, Valgrind provides a sort of very nice instrumentation interface. So we had a hacked Valgrind
that would report the number of loads and stores, and if that ever changed when you cranked up the
optimization level, then the compiler was wrong, at least for a memory location that
was only accessible through volatile-qualified accesses.
Okay.
So we started doing that, and this was a lot of fun.
We reported a bunch of bugs, and we were having fun, and then eventually what we realized
was that the volatile bugs weren't the only ones we were seeing.
They were, you know, compilers were just sort of wrong.
In general, they were just wrong.
That's great.
You know, they're not very wrong, right?
Because, you know, they work most of the time; they're incredibly, amazingly reliable. But almost anybody who writes
weird code has stories about the compiler being wrong. Almost anybody who writes something that just
looks funny, like maybe you're writing an operating system, and the compiler hasn't really
been used very much for that, you tend to run across compiler errors, and people sort of learn
to avoid, you know, bit fields or whatever. You know, bit fields are fine now, but 20 years ago
they were often a source of bugs. So we started looking for regular compiler bugs. And then the project
kind of snowballed a little bit because we just kept, everywhere we looked, we found them. We
found them, you know, not only in GCC and LLVM, but also in, you know, compilers used to compile
source code that ran on airplanes and source code that ran on spacecraft.
And so all of the heavy-hitting, expensive compilers for embedded systems of the time,
all of these contained these kind of bugs where we could sort of pretty easily trip up the compiler with a randomly generated program.
And so anyway, I ended up over the course of some number of years reporting like 500 compiler bugs.
I kind of, this is kind of like my
hobby and nighttime job for years. And I eventually got sort of tired of it. I eventually burned out
a little bit, which is why I stopped. But, you know, I spent a lot of time reporting compiler
bugs. And I like to think that at least in the case of LLVM, that we did the project kind of
at a good time where it was in a bit of an early stage, and especially Clang was in an early stage.
And I think it came, the project came at a good time to sort of help that code base kind of reach a level of maturity that it might not have
gotten maybe until a little bit later if we hadn't done this work.
Do you have any idea for how many of those 500 some odd bugs were actually fixed?
Almost all of them. So the LLVM community and the GCC community were really, really receptive. They were
really, you know, probably more gracious, you know, because at some level, you know, I'm helping them.
But on the other hand, some number of these bugs probably weren't ever going to affect anyone.
And the distribution between those two things, you know, which bugs were, was I really saving
someone's bacon and which bugs just didn't matter, that's really almost never clear, except in a few
cases where I would report a bug
and then other people reported the same bug as well,
having encountered it in their own code.
So that kind of thing did happen,
but it really wasn't the majority experience.
So anyway, yeah, so we ended up reporting, yeah, so, yeah, sorry,
I guess I lost track of your question.
I got sidetracked myself there.
That's fine.
We're talking about the tool that generated the random code, I believe,
Csmith.
Yeah, anyway, so, yeah, I ended up finding these kind of errors in almost all the compilers.
No, I guess I would say in all compilers we tested, and it was one of these things that
ended up being a fun project, and happily we could open source it.
And one time we got a Christmas card from a compiler development team at an embedded
software company.
They sent Csmith a Christmas card because I think
the implication was that it
saved them a lot of work by finding
lots of bugs. So this is really
sort of a gratifying project in the sense that it's gotten
a lot of industrial uptake.
I think it kind of solved a problem
that people had, but they hadn't really thought that there
was an automated solution to.
So this might actually be
derailing us slightly here, but I think JF Bastien
in particular, and I know there's been some discussion about, like, trying to deprecate
volatile and replace it with something that is more well-defined semantics. And I feel like
the average C++ programmer doesn't really know what volatile is for. Can you speak about volatile
for a minute? And if you have any thoughts on that? Sure. Yeah, it plays a really indispensable role in kind of low-level code
like operating systems and embedded systems for accessing, you know, things like page tables,
for accessing things where the actual presence of a load or store operation matters. And of course,
you know, this is a pretty small fraction of programmers. And C++, my take is that C++
probably incorporated volatile in the wrong
way. It incorporated it as something that's very deeply tied into the
type system. And if you read the C++ standard, you can see places where it affected the language
negatively. You know, it does things like it affects overload resolution and things that it just
shouldn't ever probably have gotten tied into. And so JF's proposal, I think, is an extremely good idea,
which is basically to try to strip out some of these things
that Volatile shouldn't have ever been doing in the first place,
you know, while actually retaining the parts of Volatile that benefit systems.
And so what you really want to do is mark a particular load or store
as needing to happen or not needing to happen.
But this has nothing to do with your big class hierarchy
and really should not be playing into
overload resolution and stuff. So this is one of these
things where I think it's a cleanup
that really kind of should happen
and needs to happen. And my
take is that it will take some of the C++ header
files and knock some pretty
big chunks of junk out.
If it were removed as a possibility
for a member function
qualifier,
that would halve the number of overloads for many utility functions.
That's right. That's exactly right.
And that's, so it was a while ago that I looked at JF's proposal,
but my understanding is that's exactly what he wants to do.
Okay. Okay. You worked on another project called C-Reduce. Is that right?
Yeah. So C-Reduce was kind of Csmith's brother or sister.
And so the issue here was that you would generate a program, and, well, generating a random program isn't that hard. You just kind of spit out syntactic elements and try
to connect them in a way that they'll parse, and that's not really that hard. But the problem is,
if all we want to do is try to crash the compiler, then we just sort of blurt out a lot of, you know, syntactic C or C++, and we see if it crashes the compiler.
But that really wasn't the goal. The goal was to find wrong code bugs, where the compiler
translates a program into an executable that doesn't respect the intent of the source code.
And to do that, you have to generate programs that don't have any undefined behavior. And so
the real crux of Csmith was generating code while sort of on the
fly analyzing the code for undefined behavior and kind of changing the trajectory of the generation
so that nothing it emits is ever undefined, if that makes sense. And so that's the job that the
tool does, and it did a pretty good job at it. But the problem was that what would happen is you
would compile a Csmith-generated program with GCC at -O0, and it would print a checksum of its global variables at the end. And then you would
compile the same program at -O1, and it would print a different checksum. And so now the question
is where in this mountain of random junk is the problem? You know, what in this mountain of random
junk triggered the miscompilation? And this is really not easy.
And so the way people deal with this problem is they basically delete some of the code and see if the code still triggers the bug. If it does, then you can delete more code. And if it doesn't,
you have to undelete and delete something different. And so as the kind of first major
user of Csmith, I spent a lot of time kind of on this exact task. And, you know, it got old.
And C-Reduce is a program which basically automates this. It deletes code, but it also
does a lot of other things. It inlines functions. It tries to instantiate templates. So it started
off basically solving a problem created by C-Smith. But then later on, it expanded to try to
be a pretty far-reaching test case reducer,
which basically tries to make a C or C++ program smaller
while still preserving some property that you care about.
And you teach it about that property by basically just giving it a shell script.
And the shell script can look for a compiler crash or for a miscompilation.
It can do whatever it wants.
But CReduce will change the program to make it as small as possible
while still preserving the property that the shell script returns true when it runs on that program.
And so what it turns out, though, is, for whatever reason, most compiler bugs that get triggered, you know, maybe by like a three-megabyte C++ file.
So you know how it is when a C++ file has been preprocessed.
They tend to be
not small. Right. Not small. Most compiler crashes that can be triggered by three or four megabytes
of stuff can also be triggered by maybe 50 bytes or 80 bytes or even 10 or 20 bytes. Wow. And so
C-Reduce kind of tries to find that smallest program. And it's an intractable search problem,
so we can't ever hope to find the smallest program that triggers a bug.
But if we can get something that's a couple hundred bytes,
then what we do is we've handed the compiler developers something that they can use to,
handed them sort of a pretty good weapon they can use to actually track down the bug.
And so, like I say, C-Reduce has gotten a pretty decent amount of uptake among compiler developers
just because it kind of solves a problem that's sort of, you know, it's really not really very fun to solve by hand. And it wasn't the first
tool of its kind. There were some previous tools that did kind of similar things, but at risk of
not being very modest about this, it does it much, much better than other tools.
So if I find a compiler bug, do you suggest that I run it through C-Reduce before handing it off,
or are you more expecting that the compiler writers themselves are using C-Reduce on the repos that they're given?
Yeah, that's a really interesting question.
They, no doubt, do use it, because I think they can't expect a customer, a paying customer maybe, or a busy user to run C-Reduce.
But my experience as a random tester was that if I didn't do this, they probably would.
Just because the test cases I was coming up with were so ridiculous to look at.
But, you know, I would encourage you to do this.
If you ever do see a compiler crash, I would encourage you to run C-Reduce,
just not so much because you have to, but because it's really pretty fun to see what comes out at the end.
So if you do that, I hope you'll drop me a line and maybe share the snippet of code that it ends up with, because it really is sort of fun to, it really is kind of satisfying
to watch it strip away a lot of stuff that's irrelevant and leave you kind of with this core
program, which ideally at least is pretty small. And C-Reduce can be tripped up by
templates and things. Templates are pretty hard to destroy, and class hierarchies and things like this.
And there's, we did a lot of work. We spent like, you know, me and a student spent a couple of years working on this.
And we did an okay job, but there's a lot of work left to do.
We're really not sort of, we really don't have the engineering power to do this work.
So we're mostly in maintenance mode on C-Reduce, but, you know, that's fine because it works
pretty well.
Well, so last year at C++ Now, you gave the closing keynote on undefined behavior and optimizations.
And I think this is a good time to discuss that.
Undefined behavior is a large topic, it sounds like.
You mentioned it already with C-Reduce, and that really stood out to me.
You said that part of the goal is to remove undefined behavior from the program
when it was generating your random code snippets.
Did I hear you correctly on that?
Well, let's see.
Let me try to restate what you said.
So Csmith has to generate programs that aren't undefined.
That is to say, they don't trigger any undefined behavior.
And C-Reduce, then, will go ahead and introduce undefined behavior.
Oh.
Because it just makes kind of randomish changes to the code.
It just sort of destroys syntactic elements. And so if you're reducing a compiler crash, then that's usually not a problem
because most of the time the compiler developers will fix a crash even if the program that triggers
the crash contains undefined behavior, because it's basically a quality-of-implementation issue.
They don't like to see seg faults in there. Yeah, that's good. Yes.
But if you're talking about a miscompilation,
now the program has to remain free of undefined behavior, so you need a separate undefined
behavior checker running in the loop with C-reduce, because otherwise, with 100% probability,
it will end up with an undefined program. And the reason for that is that every step,
every reduction step that C-reduce tries has some small percentage of introducing undefined behavior.
But once the program becomes undefined,
now it sort of changes its meaning across optimization levels in the way that
undefined programs do.
And now it'll stay undefined all the way until it's minimized.
And so you have to stop it from that.
And so basically the strategy is this little shell script that you write,
we call it the interestingness test.
The question is, is the partially reduced program interesting?
And C-Reduce's job is to always generate interesting programs.
And any time it generates an uninteresting one, it just has to backtrack.
So in this interestingness test, you have to not only look for the miscompilation, but also check for undefined behavior.
So as long as you have an undefined behavior detection tool, like the undefined behavior sanitizer for LLVM,
then you can more or less reliably avoid undefined behavior during the
test case reduction. And at the end, you'll end up with something that you can actually report.
Okay, so you've just made a statement, and I'm going to attempt to repeat it. You said,
when there's undefined behavior, it changes the nature of the program across optimization levels
as undefined behavior does, something like that. Which I'm guessing to many people listening,
they said, how does undefined behavior affect the
code generation across optimization levels? Sure. Yeah, no, that's really good. That's really good.
And so what this kind of comes down to is, so when I teach people programming, what somebody
will often want to say to me is, hey, the compiler's broke. And the reason it's broken is because I
have this program that did something, and I turned on the optimizer, and now the program crashes,
or the program prints something.
And what I always tell them is, you know, it's possible the compiler's broken, right?
I mean, compilers do contain bugs.
We know this.
But it's 99.9% likely that you made a mistake.
For example, maybe you have a printf that specifies three or four.
Let's say you've put three format specifiers in a string,
but then there are only two arguments
passed to the printf subsequent to the string.
And so I'll just use the C example,
because C++, of course, makes this mistake much harder to make.
So of course that's great.
So of course that's great.
But anyway, so in the C program,
what the generated program is going to do
is look at stack locations
that don't correspond to anything in particular. And so it'll just print garbage for those third and fourth format specifiers.
And now when you optimize the code, the stack layout tends to change because, for example,
certain variables are allocated in registers instead of going onto the stack. So the stack
frame changes its shape. And now what gets printed in those garbage slots of the printf
almost certainly changes.
And so that's just a sort of a simple example of the kind of thing that happens is once you break
the rules in an unsafe language like C and C++, then the optimizer doesn't have to make your
program do the same thing. Whereas ideally, at least, if your program does obey the rules,
cranking up the optimization level should never change your program behavior, except in ways that the language refuses to acknowledge, like timing and things like that.
Okay. So I don't know, did that kind of make sense? Yeah, well, yes, it did. And so do you
think that this is a good thing? Or do we think, do you think we should be reducing the places that
are undefined in this standard? Yeah, it's an interesting question. The reason undefined behavior kind of came into existence in the first place was
because the language is a little bit loosely specified. And here we're talking about originally
in C. The language is a bit loosely specified, and so different implementations ended up doing
different things. And part of this is by design, because the language was intended to take
advantage of platform efficiencies.
For example, adjusting the size of an int to be the native size.
And, of course, this is very good, right?
This is why it's a good systems language.
And when the standards committee went to codify the language,
then they kind of ended up having a bit of a reverse engineering problem
where they didn't want to say that one of the C implementations was wrong.
They kind of wanted to encompass most of the existing one.
And so, of course, in some cases, they did say that the existing implementations were
wrong, but in a lot of cases, they didn't want to force too much reengineering of compilers
because they were afraid, and I think legitimately afraid, that it would then sort of just ignore
the standard.
So they said, okay, well, something like overflowing a signed integer, that's undefined because
the ones complement people do one thing in this case, the twos complement people do another thing, and the sign magnitude people do a third thing.
And so we're not going to try to put an interpretation on overflowing a signed integer because there's no single interpretation.
So that's kind of how that kind of came about.
And then more and more undefined behaviors sort of kind of crept into the language as the
standards evolved. And so, you know, even in the most recent C++ standards, more undefined
behaviors get added. You know, not very many, and I don't think they're particularly pernicious ones.
But, you know, there's this kind of a way out that the standards committee has here, which is to
just not try to enforce implementations doing something in particular
when something crazy happens in the process.
And so what it does is it makes the compiler implementer's jobs quite a lot easier because
there are these cases where they just don't have to care about, if that sort of makes
sense.
And this is all well and good, but it is a bit programmer hostile.
Some of this stuff is a bit programmer hostile, where some of the undefined behaviors are
so hard to avoid and so harmful in practice
that, you know, it's really a problem.
And so, you know, the canonical example of that is things like use-after-free, where it's
just very, very tough to avoid use-after-free in a big codebase, you know, and so real
programs do this.
And, you know, things like use-after-free and accessing past the end of an array end
up at the root of a lot of security problems that we have.
And so some of these undefined behaviors have ended up, you know, in
ways that the original standards authors couldn't have necessarily foreseen, they've ended up, you know,
causing some problems in the world. So do you see a practical, like, way out for that? Like,
it seems that having checks for things like that, would that add too much overhead or not in C++, for use-after-free?
Yeah, so that's a really good question.
And I think it's one of the real central questions
for a future of C and C++ is what to do about these things
because otherwise it's very hard to write reliable software.
And so one answer is that we use tools,
basically hacked compilers to insert these checks, as you say.
And this is a really, really good idea.
And so, for example...
Like the sanitizer. Oh, sorry.
Yeah, exactly. The sanitizers are the example.
So address sanitizer, for example, causes the compiler to insert code that traps kind of a large variety of really pernicious undefined behaviors relating to memory.
And this is 100% these tools just needed to exist.
And the fact that they do now has really changed the game for these languages.
When I teach C and C++, I basically tell the students to leave these sanitizers turned on
throughout their entire development.
And I see, looking over their shoulders, I see all the time it catches, you know,
just even little things like shifting a number by a negative shift exponent. You know, this kind of thing really could have unpredictable results using compilers
that we use from day to day. And these tools are just incredibly valuable. And they really,
like I say, I think they really have changed the nature of developing in these languages. And then the
case that we have left is that the tools often have too much overhead to put into deployment. And one thing that's kind of interesting, though,
is that some parts of, for example, the Android system
are compiled with some of the undefined behavior sanitizer's checks
left in in deployment.
So some of the video codecs, I believe,
they were really worried about buffer overflows
and basically errors in parsing a video file that came from a potentially hostile source.
Right.
And so what they've done is just left some of the checks in,
and they don't add that much overhead.
And so address sanitizer is kind of a pretty heavy-hitting collection of checks
that can significantly affect runtime,
and that doesn't tend to be left on.
But some of the lighter-weight checks, like for integer overflow, can be left on
as long as you're willing to accept that every now and then the program's going to trap
when it hits an integer overflow and exit the program instead of continuing on,
possibly producing erroneous results.
So it really just depends on what you want to happen when
the program does something bad.
And of course, this is common to all programming languages, right?
This isn't at all unique to C or C++.
Now, this particular tradeoff is a tricky one.
But the thing that's really different about C++ is there's this default decision to not trap any of this,
just to let it happen and let the execution go on, possibly making no sense whatsoever
and possibly now being controlled by an adversary.
So in your role doing research with code correctness and undefined behavior and these kinds of things,
what would you leave our listeners with? What would you say this is like the most important thing you should
be thinking about, I guess, if that might be a big question. No, I think it's pretty easy. I
think programmers should just basically turn on all the tools. Basically, these tools are there
for a reason. They produce really, these days, really pretty good diagnostics
when you overflow a signed integer or something.
It just tells you what you did at what line with what value,
leaving these things turned on during development and during testing.
And then, you know, then at deployment time,
we have a decision to make like we were discussing.
But, you know, that's sort of a domain-specific tradeoff that has to be made.
But during development and during testing, it's absolutely clear that you just use all
of these tools.
And what I find is that a lot of people are aware of these tools or aware of some of them,
but not everybody knows about them.
And, you know, and so I think it's really, I'm really glad you asked me this because
I really like this opportunity to try to help spread awareness.
And so basically what we're talking about, just to be concrete here, is if you're compiling a C++ program with Clang++, you add the command line option -fsanitize=address,undefined.
You can use those together.
Yeah.
Those two sanitizers cooperate nicely, and between them they add a really broad suite of checks to your program, which catches almost all of the really, really heavy
hitting undefined behaviors. And so this one small change, you know, it makes the program slow down.
So now maybe you're 20%, 30% slower, but you know, this doesn't matter too much during development.
And everybody should basically who's using C++ should do this all the time.
Okay. Okay. Sounds like good advice. Well, thanks for coming on the show today, John. Where can people find you online?
Well, I'm on Twitter mostly.
Email is sort of, I guess, always bad.
Just because everybody gets so much.
Twitter's a place to find me online.
But, of course, I do read email.
Yeah, one of those places would be great.
Okay.
Thanks so much for coming on the show.
Yeah, well, I really appreciate the invitation, guys.
Thanks a ton.
Thank you.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic, we'd love to hear about that too.
You can email all your thoughts to feedback at cppcast.com.
We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter.
You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter.
We'd also like to thank all our patrons who help support the show through Patreon.
If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast.
And of course, you can find all that info and the show notes on the podcast website
at cppcast.com.
Theme music for this episode
is provided by podcastthemes.com.