CppCast - rr
Episode Date: December 2, 2015

Rob and Jason are joined by Robert O'Callahan from Mozilla to discuss the rr project. Robert O'Callahan has a PhD in computer science from Carnegie Mellon and did academic research for a while at IBM Research, working on dynamic program analysis tools. At the same time he was contributing to Mozilla as a volunteer, until he switched gears to work full-time with Mozilla; Robert has been working on what became Firefox for over 15 years, mostly on layout and rendering in the browser engine and on related Web standards like CSS and DOM APIs. Lately he's been devoting about half of his time to rr.

News
- Breaking all the Eggs in C++
- The wind of change
- Celebrating 30th anniversary of the first C++ compiler: let's find bugs in it

Robert O'Callahan
- Robert O'Callahan's website
- @rocallahan

Links
- rr project
- Mozilla on GitHub
Transcript
Episode 36 of CppCast with guest Robert O'Callahan recorded December 2nd, 2015.
In this episode we discuss breaking all the eggs in C++.
Then we'll interview Robert O'Callahan from Mozilla.
Robert will talk to us about RR and how it can change the way you debug. Welcome to episode 36 of CppCast, the only podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
Doing great, Rob. How about you?
Doing good. We've got a lot of catching up to do,
because it's been about three or four weeks since we actually recorded our last episode.
To listeners, it's only seemed like a week, because we pre-recorded a bunch,
but we actually haven't talked in quite a while.
Yes, hopefully we remember how to do it.
Yeah. So at the top of our episode I'd like to read a piece of feedback. This one came in a couple weeks ago from Akshay, and he wrote in: "Hey, I love CppCast. Can we have a person from Mozilla who works on Firefox or any of the projects? They do tons of amazing C++ work, like RR."
I think that sounds like a great idea for a guest, Akshay.
So we'd love to hear your thoughts about the show as well.
You can email us at feedback at cppcast.com
or you can find us on all social networks at CppCast.
We're on Twitter, Facebook, and iTunes.
We appreciate those iTunes reviews as well.
So joining us today is Robert O'Callahan.
Robert has a PhD in computer science from Carnegie Mellon and did academic research for a while at IBM Research, working on dynamic program analysis tools.
At the same time, he was contributing to Mozilla as a volunteer until he switched gears to work full-time with Mozilla. He's been working on what became Firefox for over 15 years,
mostly on layout and rendering in the browser engine,
and on related web standards like CSS and DOM APIs.
Lately, he's been devoting about half of his time to RR.
Robert, welcome to the show.
Great, thanks for having me.
Well, that's an amazing time span there.
When did Firefox actually get its name?
Do you recall?
How long has that been?
It's so long ago.
I think it was 2002 or 2003.
Wow.
We actually went through a few different iterations of the name
because of trademark issues.
That's cool.
Okay, so we have a couple news items we want to get through
before we start talking about RR.
This first one is a Scott Meyers blog post titled Breaking All the Eggs in C++.
And Jason, we were both kind of looking forward to this article because we kept talking to Scott after we had him on the show a couple weeks ago.
And he kind of hinted that he was working on this type of article.
So it's nice to finally see it.
Yes, I was looking forward to it, definitely.
Yeah. Just to go over a couple of the things he's suggesting: he wants to make it so that if you're overriding a virtual function you have to say override, and to get rid of NULL and 0 in favor of nullptr. Some very interesting things.
And what he goes into at the end is that
all these changes that he's proposing, you should be able to create
a Clang-based tool in order to make
these changes to a code base.
And he's saying the standards community should basically work on
deprecating and then removing these features over a course of 10 years.
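To make that concrete, here is roughly the kind of cleanup being described, as a hypothetical before/after sketch (the Base/Widget names are made up for illustration and aren't from Scott's article; the "after" types are renamed only so the snippet is a single compilable file):

    #include <cstddef>  // for NULL

    // Before: pre-C++11 style the proposal would eventually deprecate.
    struct Base { virtual void draw(); virtual ~Base(); };
    struct Widget : Base {
        virtual void draw();              // overrides Base::draw, but doesn't say so
    };
    void reset(Widget*& w) { w = NULL; }  // NULL (or 0) used as a null pointer

    // After: the style the proposal (and a Clang-based rewriting tool) would produce.
    struct Widget2 : Base {
        void draw() override;             // must say override when overriding
    };
    void reset2(Widget2*& w) { w = nullptr; }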
Which sounds pretty reasonable to me.
What do you think, Jason?
I don't know how I feel about actually completely deprecating the features, but I could totally
see modes and compilers or something that would say enable this stricter mode.
I don't know.
I don't, I mean, I guess maybe I was looking for something maybe a little bit more drastic, to see, like, let's really get rid of some stuff. But I don't know what that would be yet.
Yeah.
Robert, what are your thoughts about this?
You know, you can't... it's hard to undervalue backwards compatibility.
So changing stuff like this is really scary,
especially, you know, I'm used to working on the web where there's a whole lot of old, old websites
that browsers have to keep running.
And so we take backwards compatibility very, very seriously.
I think the situation in C++ isn't quite so bad
because if someone's running the compiler
and compiling the program,
then they must be,
or hopefully they can actually change it
and fix it for a new C++ standard.
Yeah, and we're only talking about changing this
with future versions of the standard,
like deprecating NULL and 0 as null pointers
in C++20
and then maybe removing the actual feature from C++23 or something like that.
So unless you're updating your code base to a C++ whatever compiler,
these things won't even affect you.
Well, and as Scott points out, you can use things like Clang's tools right now.
Basically, all of these examples could totally happen tomorrow,
and we could use Clang's tools to rewrite the code to meet the new standard.
Right, right.
So as long as that path exists, yeah.
So speaking of getting into new versions of the standard,
this next article comes from Meeting C++
where Jens actually tweeted out a survey
to see what version of C++ developers were using.
It's a little biased because the people who are following
the Meeting C++ Twitter account are more up-to-date with things
and they're active and interested
in where C++ is going. But still, it's a very interesting poll, and C++11 is the most used version according to this Twitter poll, with 57%, and 17% using C++14. So, you know, almost three quarters of users are on 11 or 14, which is pretty impressive.
Jason, do you have any thoughts on this?
Oh yeah, I mean, it's great to see people moving that way. I think everyone I know who's currently not using the new standards, it's because of some business reason; they have to support ancient compilers, basically.
Right. It's unfortunate.
How about Mozilla? Robert, what is Mozilla using these days?
We're mostly, well, we're using C++11 a lot,
and it's been great.
Our biggest problem is on Android.
There's some old, the libraries there are not so good,
so especially STL stuff on Android is not so good.
But we're using 11.
We've gotten into the habit of introducing new features as quickly as we can
based on the platforms we're supporting.
We're trying to keep our compilers up to date to make that possible.
It's been good.
I forgot about that.
I've gotten a couple of support requests for my open source project from people complaining they can't compile on Android because of STL issues.
Yeah.
Okay, and this last
article is pretty interesting.
We talked, about a week ago, about how it was the
30th anniversary of the first C++
compiler, Cfront.
And someone went ahead and looked at the code, which is now open source, and went looking for bugs in it, which was pretty interesting.
Jason, did you take a closer look at this one?
I did, I did.
And I would totally recommend reading this article, by the way,
going through each of the examples,
because there is a couple of random...
Well, okay, so the first thing that struck me, if I may,
is Cfront. Let's see, I'm looking at the date here. 1983, right, was the very first version.
So I'm looking at these calls to C standard library calls
and really just letting it sink in that they have been unchanged for at least 32 years.
And that just kind of impresses me a little bit.
But he's making some mistakes
that would totally be caught by a modern compiler,
like passing the incorrect type of arguments to fprintf.
But there's a couple of little things
and a couple of little tidbits,
and Bjarne responded to this article,
and they have his responses at the bottom.
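As a small illustration of the fprintf point above (a made-up example, not one of the actual bugs from the article): modern compilers diagnose format/argument mismatches that tools of Cfront's era let through.

    #include <cstdio>

    int main() {
        long n = 42;
        // Mismatch: %d expects int but a long is passed.
        // GCC and Clang warn about this today (e.g. with -Wall, which enables -Wformat).
        std::fprintf(stderr, "count = %d\n", n);
        // Correct:
        std::fprintf(stderr, "count = %ld\n", n);
        return 0;
    }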
So it's a neat little piece of history, and you could probably learn something about good quality code.
Yeah, definitely agree with that.
Okay. Robert, let's talk to you about RR. We actually went over this a couple of weeks ago as a news item, but for anyone who hasn't listened to that episode, could you maybe give us an overview of RR?
Sure. Well, I mean, the name, as the name implies, RR is really about recording and replaying program
executions. And what that means is that you can run RR on a program, which could basically be a
set of processes, you know, a whole tree of processes,
and it will record the entire execution
in essentially perfect detail,
which means that we can replay that
and the program will take the exact same memory
and register states as it executes
during the replay as it did during recording.
That means basically all the information you need
to debug a failed execution is there,
and there's really nothing missing.
So you get a perfect replay.
And of course, to make that useful,
we have to make it possible for you to actually debug
the program as it replays, and we do that.
We have integration with GDB,
so that you can apply GDB to the replay as it happens. And we actually then went further than that and made it possible to use reverse execution during the replay. So GDB has some commands like reverse-continue and reverse-step that work with things like breakpoints, and normally in GDB the implementations of those are either completely missing or really, really, really slow. But with RR we can implement those very efficiently. Basically the idea is that you replay the program and take checkpoints periodically, and then, you know, if you do a reverse-continue command,
we can restore to a previous checkpoint
and then replay again
until we get to the part of the program
that we need to stop.
So that's basically what we do.
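A minimal, self-contained sketch of that checkpoint-and-replay idea. This is my own toy model, not rr's actual implementation: the whole "program state" here is just an event counter, whereas rr checkpoints real process state.

    #include <cstdint>
    #include <iostream>
    #include <set>
    #include <vector>

    // Toy deterministic replayer: its entire state is an event counter.
    struct Replayer {
        uint64_t now = 0;
        void restore(uint64_t checkpoint) { now = checkpoint; }  // rewind to a checkpoint
        void step() { ++now; }                                    // replay one event forward
    };

    // Reverse-continue: find the latest breakpoint hit strictly before `current`
    // by restoring an earlier checkpoint and replaying forward, then land on it.
    uint64_t reverse_continue(Replayer& r,
                              const std::vector<uint64_t>& checkpoints,
                              const std::set<uint64_t>& breakpoints,
                              uint64_t current) {
        // 1. Pick the nearest checkpoint before the current position in the trace.
        uint64_t cp = 0;
        for (uint64_t c : checkpoints)
            if (c < current && c > cp) cp = c;

        // 2. Replay forward from it, noting the last breakpoint hit before `current`.
        r.restore(cp);
        uint64_t target = 0;
        while (r.now < current) {
            r.step();
            if (breakpoints.count(r.now)) target = r.now;
        }
        if (target == 0) return current;  // none hit; real rr would search earlier checkpoints

        // 3. Rewind again and replay exactly up to that hit.
        r.restore(cp);
        while (r.now < target) r.step();
        return r.now;
    }

    int main() {
        Replayer r;
        std::vector<uint64_t> checkpoints = {0, 100, 200};
        std::set<uint64_t> breakpoints   = {57, 142, 199};
        std::cout << "stopped at event "
                  << reverse_continue(r, checkpoints, breakpoints, /*current=*/180)
                  << "\n";  // prints 142
    }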
Yeah.
So what was the motivation to first create RR?
Right.
So there are really two motivations.
One is that we have a lot of problems with intermittent test failure.
So, you know, like most big projects, we have loads and loads and loads of automated tests.
When you do a Firefox check-in, you know, we probably run hundreds, maybe thousands
of machine hours of automated tests to validate that change.
Now, some of those tests have problems where once in a while,
the test will fail.
And this is because of a bug.
It could be a bug in the code or it could be a bug in the test.
It could be, you know, some non-deterministic, weird thing that's happening in the system, in the test farm. And so these are frustrating, right? Because, you know, whenever you do a check-in you're running millions of tests and a handful of them will fail, and it's difficult to know whether you caused that or whether it's just what's happening. And so these intermittents are horrible and they're hard to debug,
because as we all know, debugging
bugs that only occur once in a blue moon
is really, really painful.
One of the goals here was to make
those bugs easy to debug
by running the
program until you catch one of those
test failures in RR
and then you can replay
that failure over and over again, get the same
failure, same execution, and you'll be able to figure out what's going on.
That's the idea.
So that was the first goal.
As well as that, though, we just want to make debugging more fun.
I mean, we all spend, certainly at Mozilla and I think elsewhere, we spend a ton of time debugging.
You know, we've got loads of bug reports
and we spend a lot of time
fixing those bugs and a lot of that time is debugging.
And so
we wanted to make that better and
it's pretty well known that
when you're debugging, you're
reasoning backwards from effects
back to causes. And so
that's naturally something you want to do
by running the program in reverse time,
reverse time order, right?
And we have different,
well, we've all developed different strategies
for simulating that.
We use logging or we run the program lots of times
or we have some guesses about where the bug might be
and we try to break there and see what's happening.
But we're actually always working around the fact
that debuggers normally only execute forwards.
And the reason they do that is because
it's hard to implement reverse execution.
So we had the idea always as well
that RR would let us implement
a really good reverse execution backend that programmers could really use, and it would be efficient.
And that's how it worked out.
I'm kind of curious.
You said these basically are features built into GDB already that you have made better versions of.
Did I understand that correctly?
That's right. So there's actually,
GDB has several different backends
implementing these features.
You know, there's,
if you just use GDB out of the box on Linux,
you can tell it to record program execution.
And it will actually save
some
execution to a buffer and then
implement its commands on top of that.
But there are a lot of pretty big
limitations with the one you get
out of the box. The main
limitation is that the way that that's implemented is
that when it's recording
program execution, it basically single steps the program and records the effects of every single instruction
to memory, and so it's at least a thousand times slower than running your program normally.
Wow.
Whereas with RR, the overhead is more like 1.5 times or less.
Wow.
So you can see there's a pretty big performance difference.
Another limitation is that with GDB,
because of the very, very intense logging it's doing,
you can only record a small section of program execution before you basically run out of memory and explode. So you can record maybe a second of execution if you're lucky, right? But with RR,
we can record hours of
execution, and we've done that,
and it's fine. Wow.
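For reference, GDB's built-in recorder that Robert is comparing against is driven with commands like these (standard GDB commands, no rr involved):

    (gdb) start                # run to main
    (gdb) record full          # start GDB's built-in (slow, in-memory) recording
    (gdb) continue             # run the region you care about
    (gdb) reverse-continue     # run backwards to the previous breakpoint hit
    (gdb) reverse-step         # step one source line backwards
    (gdb) record stop          # stop recording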
So does that mean that
standard GDB GUIs
that are out there work with these features
also then?
I don't think so at the moment
because I think these features are very rarely used
by developers, because of those performance issues.
And I don't know of any GDB UIs
that support this other than the command line.
I could be wrong.
I haven't looked at them.
I don't use them myself, so I don't know.
Okay.
I hope that they could be added pretty easily though.
All right, cool.
So you talked about the goal, or one of your goals, was to make debugging more fun. What's the workflow like exactly when using RR?
Right. So the workflow is that you just run your program under RR. You basically type rr and then your command line, and then RR runs the program.
Everything should work as normal.
You can interact with it.
Because RR is low overhead, it should feel just like you're using it normally.
And it does.
It really works.
And so when you've finished recording your program, you can shut it down or just kill it, and then you can
use an RR
replay command,
which basically drops you into GDB
at the start of where
the program started running.
And then
you can basically just
run GDB on it like normal.
One limitation right now is that
you don't get the actual window
if you're debugging an interactive application
because the actual window is part of the side effects of the program
which aren't replayed.
We're not writing to files or writing to sockets during the replay
because obviously that would kind of be crazy
if your program did something dangerous or state-changing.
But you can get all the console output,
and of course you can set breakpoints anywhere,
inspect anything,
and of course you get those reverse execution features
when you want them.
There are other extensions to that workflow that we support.
So you can, for example,
tell RR to replay the program
and log an event number
before every line of console output.
And then you can replay to a particular line of console output.
Interesting.
We'll start the replay.
We'll basically seek to that particular point and start replay there
so you can figure out what caused some particular line of output to be produced.
And yeah, that's basically,
those are the most useful
workflows at the moment.
It's quite common also to,
if you've got a bug
that's hard to reproduce,
you might run RR
on your test framework
over and over again,
you know, on your test harness
until you see the bug.
And then you might,
in between each run,
if you've seen the bug,
you would stop.
But if you haven't seen the bug,
you kind of delete that recording
and just try again, so that sort of workflow is also
very common. And one more workflow
that's common is if you're debugging a
multi-process workload, which we do a lot
in Firefox, because Firefox is
sort of multi-process now at least
in nightly builds
and so if you record a workflow with lots of processes in it, you can get, like, a ps command that shows you all the processes that were spawned during that session,
and you can attach to a particular process and debug that.
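Put together, the workflow Robert just walked through looks roughly like this. Command and flag names are from memory of rr's CLI (the program name myapp is just a placeholder), so check the rr help output for the exact spelling on your version:

    rr record ./myapp --some-args    # run and record; overhead is low enough for normal use
    rr replay                        # replay the latest trace and drop into gdb
    (gdb) break some_function
    (gdb) continue
    (gdb) reverse-continue           # reverse execution against the recorded trace

    rr ps                            # list the processes recorded in the trace
    rr replay -p <pid>               # attach the debugger to one particular recorded process
    rr replay -M                     # (as I recall) mark program output lines with event numbers
    rr replay -g <event>             # start the debugger at a particular recorded event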
Interesting.
So you talked about how you have all these automated tests
that run against every commit.
Are you now running those automated tests using RR
or do you kind of go back to RR if you see a problem occur?
We aren't doing that yet.
And that's because there's a couple of reasons for that.
We are looking into that.
We have people who are actually setting that up.
The main issue right now is that some test failures are hard to reproduce under RR.
It's because of the way that RR works.
We basically simulate a single core,
and then you can have a multi-threaded program, but we only run one thread at a time; it's a technical consequence of the way RR is designed.
And that's fine for lots of bugs,
but there are certain kinds of bugs that are hard to reproduce that way.
And we're working on improving that situation, but it's kind of a research problem.
I can talk more about that later, what the issues are.
Partly because of that, it's not really a good idea to run all our tests under RR.
Okay.
Because that might mask some bugs that could really happen.
Some of the other issues are just sort of technical,
like just actually getting it running in our cloud environment.
All our Linux tests are running on Amazon,
and getting it running there is actually a problem as well, because most of the cloud providers don't enable
the hard performance counter features that RR relies on.
So you can't actually run RR in the Amazon cloud.
There is one company, DigitalOcean, has a cloud setup
where they actually enable those counters, so we can't run them there.
So that's another issue that right now all our tests are not running
on the right cloud.
I'm guessing you already answered this question,
but is QA running the application using RR,
or do they kind of find a bug and then developers use it?
It doesn't really reach QA yet.
We're just having developers mostly using RR at the moment.
So you said it relies on the hardware counters.
That, I believe, implies it doesn't work in VirtualBox.
Is that correct? Do you know?
It doesn't work in VirtualBox at the moment.
It does work in VMware.
Oh, okay.
And it does work in Linux KVM.
So virtualization isn't inherently a problem.
Some VMs are able to virtualize the counters.
It's just that...
And Xen also supports PMU virtualization.
So VirtualBox doesn't.
It would be a neat project for someone to add a PMU driver to VirtualBox
since it's open source.
We haven't gotten around to doing that, but it would be cool to do.
Yeah, I could have used that for some performance monitoring I was doing recently.
Kept running performance tests and not getting any counts back,
and then I figured out why.
Yeah.
So we've talked about a couple of limitations
that RR has, the single-threaded aspect,
not being able to run on some virtual machines.
Are there any other limitations that are worth pointing out?
Yeah.
One big limitation is that this is really only for Linux,
and it's quite interesting, I think, why.
I mean, RR relies on recording user space. RR records user space execution, and it replays user space execution.
And we kind of need to understand
everything that's happening in user space.
And RR also needs to understand the kernel interface
in considerable detail.
And in Windows in particular,
the interface of the kernel is an unstable
and sort of mostly private interface
that only Microsoft really understands.
And so Microsoft could implement something like RR,
but it's very difficult for anyone else to do that.
And it would also kind of be more work
because the interface, as I understand it, on Windows changes from version to version.
Whereas on Linux, the system call interface is very well documented.
It doesn't change.
It's a stable interface.
It's backwards compatible.
And any time you have a problem, you can actually dig into the Linux kernel
source code and try to figure out what the kernel is actually doing and that's very very helpful
So although the techniques of RR could be applied to other platforms, anything that's not Linux is going to be a lot of work; it's going to basically be a rewrite, and on Windows especially it'll be very, very difficult to do. The other limitation that I mentioned... sorry?
Yep, I was going to ask, is RR able to run on Mac OS X?
No. Again, it could be ported, and the port for Mac would be easier because you've got more source code available and stuff, but it doesn't run there. Again, it would be a ton of work, because a lot of what RR does is really about wrangling the system call interface, the interface between user space and the kernel.
So it's a lot...
And those are different, quite different.
Another limitation, which I mentioned earlier,
is the single core limitation.
So if your application is very parallel,
you use a lot of cores,
then RR is going to slow your program down massively.
It's also quite likely that the kind of bugs you're interested in won't even show up when you're using it because of that. So that's a limitation. I mean, obviously
there's, you know, Firefox can use multiple cores and we do have some, we do have parallelism,
but most of our tests don't really exercise that. They're focused on exercising specific pieces of DOM
and HTML and CSS functionality and other APIs,
and those don't really require multiple cores to perform well.
There's also a few extra limitations
to do with the kinds of kernel features
that RR doesn't support.
So the way that RR is designed
is that
we don't support
sharing memory between the application
that's being recorded
and things outside the application
like device drivers,
kernel device drivers, or
other applications,
other processes that aren't
being traced.
And that's just because of the way that RR works,
which is that... Well, I'll go into that later, maybe.
But because of that limitation,
that actually works fine on Linux.
You can turn...
Linux applications tend not to share memory
with the rest of the system,
except in a few really well-defined cases,
like X11 shared memory
or the Pulse Audio shared memory buffers,
and we can actually turn those features off
and there's some kind of fallback
for most of those features.
And that's what we do, and everything works great.
But if you have a program that needs to share memory
with some other thing that you can't record in RR,
and a particular case of this is graphics drivers.
So if you're using direct rendering on Linux,
then you might be sharing memory directly with the graphics driver, and that could cause problems.
So there's a few limitations like that.
That's basically it, though.
It's pretty general.
I mean, we've run a huge number of different applications
through RR successfully, and it's been good.
You just alluded to maybe that you could go into the mechanism
by which RR actually works.
Can you do that over the call?
Yeah, I can probably give you...
I mean, I think it's, yeah, I can.
Okay.
So, I mean, basically, the basic design goal of RR, which makes it different from pretty much everything else that's out there,
is that we wanted to build a tool that had very low overhead, and we also didn't have many resources to build it.
We only had a few interns and some half-time engineers.
So we took the approach in the beginning
that we weren't going to instrument code.
Most tools in this space,
pretty much every other tool in this space,
instruments code.
You know, basically some kind of binary code rewriting system,
something like Valgrind or DynamoRIO,
basically where you've got the machine code of your application
and you're translating it and adding instructions
into the instruction stream to record things and monitor things.
We decided from the beginning that we didn't want to do that,
and we decided to try to figure out how much we could do without doing that.
And two reasons, really.
One is that those systems are horrendously complicated and very fragile.
So, you know, every time Intel extends the instruction set,
which is like all the time,
you have to update your code rewriter to handle all the new instructions,
which is a ton of work.
And it also means you're always kind of behind, right?
Someone brings out a new compiler
and suddenly your tool doesn't work.
The other problem, of course, is overhead.
Rewriting binary code is expensive.
You can get the overhead reasonably low,
but it's always going to be there.
And especially it's there when you've got self-modifying code, right?
And self-modifying code is actually incredibly common for web browsers
because we've got these JavaScript engines
that are generating code on the fly
and also patching code on the fly.
They do a lot of code patching for technical reasons.
They've got these polymorphic inline caches, is the buzzword.
And so, you know, rewriting binary code
is slow and difficult,
and we didn't want to do that.
So we don't do that,
and it's basically the key differentiator for RR.
And we use performance counters to measure progress.
So the key problem for these systems is,
if I've got some asynchronous event,
like an alarm going off and trapping to some signal handler,
we need to be able to replay a certain amount of program execution
and then deliver that interrupt
at exactly the same point where it happened during recording.
And that's difficult to do.
That's really the hardest single problem we have.
And so you need to be able to measure progress accurately through your program.
And that's what we use performance counters for.
So in RR, we actually count the number of retired conditional branches.
And that's our measure of progress.
And we combine that with the state of the current registers.
So we say, okay, stop the program after we've executed this number of conditional branches
and the registers look like this.
And then
we record that and then we replay
and during replay we actually get that
and we can do that with no code instrumentation, by counting those branches using the performance counters.
That's pretty much all I need to say about that.
Okay. Very interesting.
So what's the future look like for RR?
Yeah, so there's tons of stuff that could be done with RR.
It's a really powerful sort of baseline framework.
Once you've got this record and replay
and also checkpointing capability,
you can build a ton of interesting stuff on top of that.
I really hope that people actually pick it up
and do some of those things with it
because we at Mozilla don't necessarily need
and don't really have the time
to actually build everything that you could do.
So some of the projects
that I probably would like to do at Mozilla,
one of them is really to focus on this problem of reproducibility.
We do have bugs that are hard to reproduce
that don't reproduce on RR easily,
and I want to understand more about why that is
and try to tweak RR to make those things more reproducible.
We believe that a lot of those bugs are to do with the way that the scheduler works.
RR's scheduler is pretty deterministic at the moment,
and maybe we need to introduce some randomness to the scheduler.
But there are probably other things as well that are going on.
So we want to figure out that problem.
As I said earlier, it's a research problem.
No one really knows how to do this, as far as I know.
And so that's something we're going to have to investigate.
Another, well, there's so much more that could be done,
I don't even know where to start.
But another big, big area is sort of dynamic analysis
of the replay trace.
So imagine that you want to use something like Valgrind.
Valgrind, I think is the correct pronunciation.
Imagine you want to use it to find memory leaks
or memory errors in your program.
RR is good for debugging a bug once you've found it,
but you might want to apply some dynamic analysis tools
to actually detect bugs.
Well, we could actually do that during the replay.
So you could record the program with really low overhead, and then you could turn on some
instrumentation during replay to search for bugs.
And that would be kind of neat because it means that the low overhead recording would
mean your program wasn't too disturbed by the tools you're running,
and then you can run the other stuff offline at your leisure.
That would be awesome.
And a framework for doing that could be built,
and it wouldn't be that hard, but it's just a bunch of time.
Those are a couple of things that would be great to do.
I could definitely see utilizing that.
I was also wondering if maybe it's possible to monitor things like memory usage
over time or something during the replay.
Yeah, absolutely.
Performance analysis
during recording
and replay would be pretty interesting.
Monitoring memory usage
is one of those things that would be easy to do
with instrumentation if you could instrument
the replay only.
Also, because RR is relatively low impact
on the program execution,
you know, you could actually imagine,
well, I've actually done this, actually.
You can run a profiler during recording,
and then you can replay
and compare the profile to the replay.
You can say, well, you know, we're spending a lot of time in this piece of code, you know,
and then actually debug that, you know, without...
And you've got your performance results already,
but now you're debugging the exact execution that produced those performance results.
That's pretty cool.
Okay, I might have to play with that.
So, I mean, this is all specific to GDB, and as we mentioned during the news and stuff, everyone's talking about Clang and LLVM and whatever. Are there plans... does it work with LLDB?
I don't really know. Okay, I don't know that there are any fundamental issues there. I don't know if LLDB has the same kind of retargetable reverse execution backend that GDB has. If it doesn't, that could be added. And when it does, RR could be added to support that.
RR really works at a very low level.
So almost all the machinery of RR is pretty generic.
It works on any binary code.
It's not C++ specific.
It's not language-specific or compiler-specific.
In principle, you can debug all kinds of things with it.
We have been debugging Rust programs with it.
Some people have.
And, of course, Rust is LLVM-based.
So LLVM itself is not a problem.
LLDB would just require a bunch of work to plumb the interfaces through.
Okay.
Are there other tools in the space similar to RR that you're familiar with?
Yeah.
So there are, and there's been a ton of academic research in this area,
and there's also been a few commercial projects
that have done this.
And I think the most important one is
there's a tool called UndoDB by Undo Software,
which actually is very, very similar functionality to RR.
The way you use it also is fairly similar.
And it also plugs into...
It's also Linux-specific.
It also plugs into GDB.
You know, it's a great tool.
And I think that, you know,
as I understand it,
you know, the main difference between RR and LDB is...
Sorry, the main difference between RR and UndoDB
is that UndoDB uses code instrumentation to do its work,
and so it has a little bit higher overhead than RR.
It's also, they have to do,
that's also more work for them to keep it maintained
and up to date with instruction sets.
The one thing that it does have that RR doesn't have, that's really important, is it has ARM support. And, you know, there's some deep technical reasons that I probably shouldn't go into now why on ARM we can't actually do RR the way that we've done it. And so you need to use code instrumentation on ARM, and so UndoDB works there, but RR does not.
And so that's a good option if you're on ARM.
Okay, so I was wondering before if RR would support Android, so I guess not.
Right, well, Android x86.
Okay.
Not really what you're looking for.
But yes, we did look into ARM,
but there are some really cool and interesting
and deep and unfortunate technical reasons
why they didn't work. They could be fixed by the
ARM guys. I hope one day they will fix it.
But anyway,
so, yeah, the other tool that's really interesting and related to this is that years ago,
VMware had a VM-based record and replay engine where they record and replay the execution of an entire virtual machine.
And you could actually run Windows in this.
And so you could get record and replay for Windows.
And we actually used that at Mozilla to deal with some really horrible bugs,
and it was pretty good.
I mean, it wasn't quite all there.
Unfortunately, they cancelled the project
before they really got the engineering really solid, I think.
But I was very sad they cancelled that.
It was a great system,
and we actually, I believe that at Mozilla
there is still somewhere a Windows machine set up to run that.
We've taken it off the network so that it doesn't get any Windows updates
because otherwise it will stop working.
But, yeah, that was a cool project.
All right, so this is a C++ podcast.
RR is written in C++, is that right?
That's right.
So as a C++ developer, we've got to ask
you the standard question of do you have any
favorite features of C++
or favorite annoyances of C++?
Yeah.
C++ is cool.
I really like
templates. I like the metaprogramming,
compile time, all the compile time stuff that you can do. I like the metaprogramming, the compile time,
all the compile time stuff that you can do.
I think that's really great.
Being able to generate code essentially,
the exact code that you want
for a particular data structure or whatever
with specializations and things like that.
That's really neat.
I'm not a huge fan of C++
to be honest with you guys.
I think the new language features are cool,
but the problem I see is that the complexity only grows
and it's already incredibly complex
and it's just getting more and more complex.
And, you know, you really have to study to keep up, right?
Like, you know, I had to sit down, you know,
for an hour or two and like figure out,
you know, rvalue references. And I'm sure that I don't understand all the interactions they have
for the rest of the language. So just, you know, I would really like, you know, people have talked
about having, at some point, you know, a restricted subset of C++ with all the good parts, and the bad parts kind of disabled.
And I think that would be really great.
You know, I also, you know, because we use C++ to build a web browser
and web browsers, you know, security is incredibly important.
And the unsafe C++ features, which, to be frank, you can't really live without,
are a problem as well. So, you know, I would love to see, you know, a safe version of C++.
You know, at Mozilla we've got... maybe I shouldn't mention it on a C++ podcast, but we decided that we didn't really want to build, you know, another iteration of a web browser in C++
because of the safety issues,
and so we built this Rust programming language
which tries to deal with a lot of that.
So I think those are the things that I worry about with C++.
But I do enjoy using it.
We use C++ for Firefox for a reason.
It's not just legacy.
Right now, until now anyway,
C++ has been the only language
you'd be sane to do something like this in.
It's okay.
It's okay, yeah.
Earlier, when I read the piece of feedback,
the listener talked about how Mozilla
has lots of other projects
that might be interesting for C++
developers. Are there any you could maybe mention? Obviously, we can't go too deep into anything.
I think WebAssembly, what used to be called Asm.js, and now I think it's turned into WebAssembly.
I think that is a pretty interesting thing. I don't know if you guys have covered it.
I haven't checked, but... Yeah, we had JF Bastien from Google a while back
talking about WebAssembly.
Okay, cool.
Okay, so that's a pretty big deal.
So that's cool.
I mean, you might, if you guys are really bold,
you could invite a Rust person on
to talk about C++ versus
Rust. I think that's pretty
fun. Pretty fun podcast.
But maybe
you shouldn't.
We've actually had someone from Rust on too.
I have.
Thank you for the suggestion.
No, no. Cool.
Okay. Jason, is there anything else
you wanted to ask?
I think I covered all the questions I had.
Okay. Well, Robert, where can people find you online
or find more information about RR?
So there's the rr-project.org website.
It's the main landing page.
People can email me.
You've got my email address.
Also, we discuss RR on IRC, if people still use that. That is irc.mozilla.org, #research. That's where we hang out for that, so I'm often online there.
So those are the main places.
And we're always...
Also, we're on GitHub, of course, and
people should
download RR and file
issues if they have them.
Okay. Well, Robert,
thank you so much for your time.
It's a great pleasure. Thank you for having me. Thanks for joining us.
Thanks so much for listening
as we chat about C++.
I'd love to hear what you think of the podcast.
Please let me know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic, I'd love to hear that also.
You can email all your thoughts to feedback at cppcast.com.
I'd also appreciate if you can follow CppCast on Twitter,
and like CppCast on Facebook.
And of course, you can find all that info and the show notes
on the podcast website at cppcast.com. Theme music for this episode is provided by podcastthemes.com.