CppCast - Reverse Debugging

Starting point is 00:00:00 Episode 130 of CppCast with guest Dr. Greg Law, recorded December 13th, 2017. This episode of CppCast is sponsored by Undo. Debugging C++ is hard, which is why Undo's technology is proven to reduce debugging time by up to two-thirds. Memory corruptions, resource leaks, race conditions, and logic errors can now be fixed quickly and easily. So visit undo.io to find out how its next-generation debugging technology can help you find and fix your bugs in minutes, not weeks. And by Audible. Get a free audiobook download

Starting point is 00:00:34 and a 30-day free trial at audibletrial.com slash cppcast. Over 180,000 titles to choose from for your iPhone, Android, Kindle, or MP3 player. In this episode, we talk about deprecating features from C++. And we talk to Dr. Greg Law from Undo. Greg talks to us about how reverse debugging is becoming a thing. Welcome to episode 130 of CppCast, the only podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner.

Starting point is 00:01:44 Jason, how are you doing today? Doing okay. How are you doing, Rob? I'm doing all right. Is your tooth going to be okay? Yeah, you know, well, for our audience, I popped a crown off this morning, so I'm just waiting to go to the dentist to get it glued back on now. Well, hopefully it'll be okay for the next hour. It'll be fine, yeah. Well, at the top of every episode I like to read a piece of feedback, and this week, Jason, we got a tweet between you and Bartłomiej Filipek. I apologize if I butchered that name. But apparently you got some of your episodes of C++ Weekly transcribed? Yeah, for the last two episodes to go live, someone has done a community

Starting point is 00:02:27 transcription of it within 24 hours. Oh, wow. I'm not sure who it is, and I'm not even sure how to see who it is on YouTube. I don't know what I'm missing there. But I just did a quick review of it and approved it. So yeah, and then there's a couple of older ones that also have transcriptions. And that's something just built into the YouTube platform? Someone can volunteer to transcribe someone else's video? Yes, yes. And there was actually a talk from last year at CppCon about doing this and why it's important to support this community effort for conference talks. Right. Because if you're a non-native English speaker, understanding someone's spoken English can be difficult.

Starting point is 00:03:12 Being able to read it is that much more helpful. And on the off chance that someone else can then translate the English into another language, that opens it up to that many more people. And obviously there can be deaf programmers who would want to see that content as well. Yeah. That also, yes, of course. Yeah. So Bartłomiej was asking, are there any plans to transcribe CppCast episodes, saying it

Starting point is 00:03:39 would also be handy to at least mark the main points from the interview with the time when they happen. I think I looked into transcription services, and it is quite expensive. So I don't think it's really something we could do anytime soon. But the idea of at least marking points in the conversation, I thought maybe we could do. You know, just say like, hey, we started talking about news here. We started talking about this topic. Let's transition to that topic.

Starting point is 00:04:04 I feel like we could probably at least do that. Yeah. It's also, from the YouTube standpoint, the automated Google speech-to-text thing gives you a kickstart to it, and it's hypothetically possible that we could upload some of these to YouTube as just audio tracks and let it do the transcription for us. That's true. We could also possibly record, but I don't know if anyone needs to be seeing us while we're doing this. No, no, no, no. Nope. Well, we'd love to hear your thoughts about the show.

Starting point is 00:04:36 You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com. And don't forget to leave us a review on iTunes. Joining us again today is Greg Law. Dr. Greg Law is co-founder and CEO at Undo. He spent nearly 20 years writing systems-level code, including novel kernel designs and networking architecture, in academia and at a variety of startups. Greg finds it particularly rewarding to turn innovative software technology into real business development. He still gets to write some code, although sadly most of his coding these days is done on airplanes. Greg lives in Cambridge, England with his wife and two children.

Starting point is 00:05:10 Greg, welcome back to the show. Hey, hi, thank you, yeah. How's the weather in Cambridge this time of year? It is, today is very gray and drizzly and really not very cool. But last weekend we had some snow. And that was the first snow. Like, that's really the first snow we've had in Cambridge, I think, for four years. Like, proper snow when you can make a snowman, right?

Starting point is 00:05:34 Wow. So, the kids were out enjoying the snow. I mean, they were beside themselves. We have a dog who's, like, less than two years old. So, that was the first time he'd seen snow. He had no idea what to make of it. So that was good fun. But now, you know, you get those snowman corpses in the park, and it's all kind of back to normal apart from that. That makes it a very interesting weather year then, I guess, because here in Denver we've

Starting point is 00:06:01 had virtually no snow, which is weird. We should have had more than one snowfall by now. And I have family in Atlanta, and they had like four inches and were making snowmen and snow forts and stuff, which is highly unusual for them. Did you get hit by that snowstorm also, Rob? Yeah, I was just going to say, I know Atlanta had that snow. We did not really get hit by it. We got a little bit of slush, but that's about it. Although my kids, you know, they grew up in New Jersey and now we live down in North Carolina, so they're quickly forgetting what snow really looks like. When Atlanta was getting the four inches, we just had some frost the next morning, and my daughter

Starting point is 00:06:41 was asking if she could go out and play in the snow, and I'm like, no, honey, that's just some frost on the grass. It's not how it works. Yeah, no. To make a frost man would take a lot of effort. A lot of effort, yeah. Well, Greg, we've got a couple of news articles to discuss. Feel free to comment on any of these, and then we'll start talking to you about what you've been up to with Undo. Okay, yeah, awesome. Okay, so this first one is actually just the start of what looks like it's going to be a multi-part blog post from Jeff Amstutz, who was on the show a while back. This is about building a C++ SIMD abstraction, and it relates to the work that he's doing on OSPRay. Yeah, there's a lot of information in here. I didn't have the time to really dig into it yet. But I had also recently

Starting point is 00:07:30 been playing with when the compiler can generate SIMD optimizations for you. Yeah. And like I said, this is kind of like an intro to what looks like it's going to be a series of posts. So maybe we can talk more about it as he continues along with the series. But this first one is kind of just an intro to the SIMD concepts, and it's definitely worth reading. Right. Yeah. Next up, from the Visual C++ blog, we have a nice long post from STL talking about feature removals and deprecations, and basically just talking about, now that we're getting into C++17 and they've introduced the

Starting point is 00:08:10 /std:c++17 flag or /std:c++latest, what you can expect to see deprecated and eventually just removed entirely from the Visual C++ compiler. This is great. Yeah. Removing stuff is great. It's almost as great as getting new

Starting point is 00:08:26 stuff. Yes, in some ways better, possibly. I just wish we could deprecate std::bind, which will never happen, probably, but with C++14 lambdas it's virtually useless. Yeah. So I guess they have a good sense of which features are very rarely used, so they can remove them without causing too much carnage, right? Generally speaking, they're features that either were never really used or features that were actually problems. Like auto_ptr. auto_ptr has never been possible to use correctly, basically.
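As a rough illustration of the std::bind remark, here is a minimal sketch comparing the two styles. The function scale_and_offset and the two factory helpers are hypothetical names made up for this example; nothing here comes from the episode or from the blog post being discussed.

```cpp
#include <functional>

// Hypothetical function used only for this illustration.
inline int scale_and_offset(int value, int factor, int offset) {
    return value * factor + offset;
}

// std::bind style: factor and offset are fixed, _1 stands for the
// argument supplied later by the caller.
inline std::function<int(int)> make_with_bind() {
    using namespace std::placeholders;
    return std::bind(scale_and_offset, _1, 3, 1);
}

// C++14 lambda style: the same partial application, with the captured
// values named in the init-capture and the call written out explicitly.
inline std::function<int(int)> make_with_lambda() {
    return [factor = 3, offset = 1](int value) {
        return scale_and_offset(value, factor, offset);
    };
}
```

Both factories return a callable that computes value * 3 + 1. The lambda version spells out which argument goes where, which is arguably the reason it is usually preferred.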

Starting point is 00:09:05 Right, but even... Go ahead. What's that? I was just going to say, even if you rely on some of these older features that are beginning to be deprecated, you could still compile as /std:c++14, is my understanding. Yes. And there are flags for many of them, preprocessor defines and stuff, that will let you turn them back on if you really wanted to.

Starting point is 00:09:27 Yeah, you can restore auto_ptr by defining _HAS_AUTO_PTR_ETC to 1. Don't do that. Right. Okay, and then the next post is on Simon Brand's blog, and it's kind of a follow-up to one we were talking about last week, where we were going back and forth between exceptions and error handling and what should be the best practice. And I found this post to be pretty interesting, especially one thing he mentioned in the beginning

Starting point is 00:09:58 where he talked about how we shouldn't just use one or the other based on what he referred to as cargo cult software engineering, which is a term that I probably have heard before, but haven't read the genesis of. Jason, are you familiar with this term? Of cargo cult? Yeah. Yes. Yeah, yeah. I'm reluctant to describe it fully here because I don't know if I'd get the details right, but yeah, it stems from, like, World War

Starting point is 00:10:26 Two, setting up military bases on these islands, and then aboriginal people imitated the people trying to land airplanes and such after the bases were closed. I thought it was pretty interesting. Yeah, so some of the people left behind have done things like make bamboo airplanes, right, and an airfield and stuff. Yeah, yeah. I heard of cargo cult for the first time a week ago at a conference in London, DevRelCon, and now here I am hearing about it the second time, so I guess it's becoming a meme, maybe. Yeah, I don't know, it's been many years since I first heard it, but I haven't heard it very often. Yeah, and basically the point of it is, you know, don't just follow what's, you know, maybe good advice

Starting point is 00:11:18 without fully understanding the the reasoning behind it like if you're programming something and and you are told this is a best practice, you should kind of understand why it's a best practice and not just imitate it blindly. Or even worse, just doing something because you see other people doing it without even knowing why they did it.

Starting point is 00:11:35 Exactly. Or if they even pretended it was a best practice. Right. And that's what this post is all about. He thinks, you know, before we make too many decisions on the types of error handling or exceptions we should use, we should kind of understand it better. And he's kind of making a call for people to try to gather some data about it. Two years since we first had you on the show, could we start off by just refreshing our listeners about what UndoDB and LiveRecorder are?

Starting point is 00:12:09 And maybe if there's anything new, product- or feature-wise, that's come out over the last two years, you could mention those? Yeah. First of all, I can't believe it's been two years. If you'd asked me to guess, I would have said earlier, beginning of this year or something. But anyway, yeah, I guess time flies when you're having fun. So, just a reminder: LiveRecorder and UndoDB both are products that allow developers of C++ code on Linux to record the execution of their program

Starting point is 00:12:47 and then wind that tape back and forward. So the viewer onto your recording is a debugger, typically GDB, and then you can kind of wind the tape back and step back. You can step back to any line of code that executed, and you can see full program state for any point in history, right? So it really allows immediate and clear answering of the how did that happen question that we're all familiar with as programmers, right? Never works first time. And more often than not, we're kind of scratching our heads trying to figure out why reality kind of diverged from our expectations

Starting point is 00:13:40 and where reality diverged from our expectations. So having the tape that you can just kind of wind back and see precisely what happened is a really powerful way to answer those very difficult questions. So people often sort of, sometimes people see this and like the first thing they think of is just, you know, that extreme frustration when you're stepping through in the debugger and you accidentally step over one too many function calls and it's like, oh, I've gone too far.

Starting point is 00:14:10 And so their immediate thought is, oh, right, great. So I can back up and remove that frustration. And that's certainly true. But really, that's just the tiny piece of the advantage of this. The advantage of this is to be able to know what the computer really did, right? To know what the previous states of your program were and be able to kind of navigate through history back and forth. So just kind of unpicking, stepping too far is just like 1% of the overall value. Going back, if you've got memory corruption, if you've got race conditions, if you've got stuff like that, and these kind of stateful errors where the problem itself is a long time before when you notice, then no longer having to guess how you got, but just go back and see what that was is very powerful.

Starting point is 00:15:07 So what is the granularity of this recording? So it's everything, right? I mean, there's nothing about the program's execution at the logical level that you cannot know. So it's not like every CPU instruction or something like that? Oh, yeah. No, no, every CPU instruction. Now, when I say the logical level, maybe there's, you know, cache line misses inside the silicon, and you don't see that. But you can absolutely step forward, step backwards, instruction by instruction, and see any address, any instruction that executed.

Starting point is 00:15:48 So it's full visibility. I mean, if anybody's familiar with rr, it's a very similar concept to that, and likewise a few other things that have been announced since we last spoke. I mean, I think the big thing that's changed over the last couple of years is how this is starting to become a thing, right? So product-wise, we've been focused very much on robustness and performance, which are the two big things, right? So can you give this to, you know, arbitrary, really complex code

Starting point is 00:16:27 and, you know, have it handle that and not fall over and also do it with good performance. And so that's been, you know, really the main progress. We have divided LiveRecorder into two products. So just very briefly, UndoDB is like basically an interactive debugger. It works just like GDB, except you can go backwards and forwards.

Starting point is 00:16:55 In fact, just in the way that you can with GDB now go backwards and forwards, but the performance characteristics are totally different. And then LiveRecorder allows you to do this offline debug. So you record the execution of the program and then come back and debug that the next day or a week later or whatever it is. And you can debug on the original machine where the recording was made, or you can debug on another machine. And we've kind of divided that out into

Starting point is 00:17:27 LiveRecorder for test and LiveRecorder for production. So LiveRecorder for test is aimed at plugging into continuous integration and other automated test environments, where the premise is you plug it into your Jenkins or whatever it is that you're using, and when you have a failed test, instead of just having maybe some log files and, if you're lucky, a core file with the artifacts, what you get is a recording file, which gives you full visibility into what that test did. And so particularly for intermittent test failures, or tests that fail in different ways each time, that's obviously very, very valuable. And then there's LiveRecorder for production,

Starting point is 00:18:09 which is really aimed at software vendors to kind of link that directly into their product and then ship that or deploy that and have the ability to then have the program record itself, basically. So it's kind of packaging and positioning of the things that have changed and the use cases, I guess. So what are the performance and disk space implications of turning on the live recorder?

Starting point is 00:18:38 Yeah, right, right. Which is exactly the right question, because making this stuff work isn't that hard. In fact, there are research papers on this going back to the 1970s, where people got prototypes going for toy languages and toy programs. So having it work on Hello World is not too bad. But if you think about what you need for this,

Starting point is 00:19:05 to have this full visibility: every time a CPU executes an instruction, it's destroying information, right? Some information no longer exists in the universe. So at a minimum, the program counter is changing or some other register is changing. If executing an instruction doesn't change any state, then you're, by definition, in an infinite loop, right? Because the next instruction is bound to do the same thing. So some piece of state, you know, at

Starting point is 00:19:32 least a word, possibly several, is changing every time an instruction executes. Just to make the math really simple: it's a gigahertz machine, it's issuing a billion instructions every second. Even if you're just storing the deltas between the instructions that execute, it's a huge amount of storage. It's gigabytes per second. And the performance characteristics are going to be pretty awful as well if you do that, just from the amount of time it will take to store that information as you change it. So instead, what we do is this technique that we pioneered of replaying the program execution.

Starting point is 00:20:15 So if you've been running for 10 seconds and you want to go back one second, well, rather than try to go back, we just go back to time zero and play forward nine seconds, right? And then there's some snapshotting stuff along the way to avoid needing to go right back to the beginning. And so that gives completely different performance characteristics than you might expect. So it means that space-wise, you know, it depends very much on what the program is doing.
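The "go back to an earlier point and play forward" idea can be sketched with a toy deterministic state machine. This is only an illustration of the general technique as described in the conversation, not Undo's implementation: State, execute_one, and SnapshotReplayer are hypothetical names, and a real system snapshots a whole process rather than two integers.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy "program": its entire state is a step counter and an accumulator.
struct State {
    std::uint64_t step = 0;
    std::uint64_t acc = 1;
};

// One deterministic "instruction": the same state in always gives the
// same state out (unsigned arithmetic wraps, which is well defined).
inline void execute_one(State& s) {
    s.acc = s.acc * 6364136223846793005ULL + 1442695040888963407ULL;
    ++s.step;
}

class SnapshotReplayer {
public:
    explicit SnapshotReplayer(std::uint64_t snapshot_interval)
        : interval_(snapshot_interval) {
        assert(interval_ > 0);
        snapshots_[0] = State{};  // time zero is always available
    }

    // Run forward, keeping a snapshot every `interval_` steps.
    void run_to(std::uint64_t target_step) {
        while (current_.step < target_step) {
            execute_one(current_);
            if (current_.step % interval_ == 0) {
                snapshots_[current_.step] = current_;
            }
        }
    }

    // "Go back": restore the latest snapshot at or before `step`,
    // then replay forward until `step` is reached.
    State state_at(std::uint64_t step) const {
        auto it = snapshots_.upper_bound(step);  // first snapshot after step
        --it;  // safe: the snapshot at key 0 always exists
        State s = it->second;
        while (s.step < step) {
            execute_one(s);
        }
        return s;
    }

    const State& current() const { return current_; }

private:
    std::uint64_t interval_;
    State current_{};
    std::map<std::uint64_t, State> snapshots_;
};
```

With a snapshot interval of 1000, asking for step 9123 after running to step 10000 restores the snapshot at step 9000 and replays 123 steps. Nothing is stored per instruction, which is the trade the conversation describes: a little recomputation in exchange for far less storage.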

Starting point is 00:20:43 It's basically the inputs your program is taking. So what you have to do to make this work is capture the non-deterministic stimuli as they come in, right? So if your program is reading, you know, a 10 gigabit data stream, then the recording is going to be of the order of 10 gigabits per second. But most programs are nothing like that I/O heavy. Actually, it's the I that counts, not the I/O. So much more typical is of the order of, say, a megabyte per second or something like that.
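A minimal sketch of capturing non-deterministic inputs and feeding them back on replay is shown below. All names (InputSource, run_program, recording, replaying) are hypothetical and chosen for this example; a real recorder intercepts system calls, signals, and similar events rather than a single callback.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Stand-in for anything the program cannot compute for itself:
// file or network reads, clock values, and so on.
using InputSource = std::function<std::uint8_t()>;

// A deterministic computation whose result depends only on its inputs.
inline std::uint32_t run_program(const InputSource& next_input, int reads) {
    std::uint32_t checksum = 0;
    for (int i = 0; i < reads; ++i) {
        checksum = checksum * 31u + next_input();
    }
    return checksum;
}

// Record mode: pass values through from the live source and append
// each one to the log. The log must outlive the returned callable.
inline InputSource recording(InputSource live, std::vector<std::uint8_t>& log) {
    return [live = std::move(live), &log]() {
        std::uint8_t value = live();
        log.push_back(value);
        return value;
    };
}

// Replay mode: serve values from the log instead of the outside world.
// Throws std::out_of_range if the program asks for more than was recorded.
inline InputSource replaying(const std::vector<std::uint8_t>& log) {
    return [&log, pos = std::size_t{0}]() mutable {
        return log.at(pos++);
    };
}
```

Running run_program once through recording(...) and again through replaying(log) gives the same checksum, and the log grows with the number of input bytes rather than the number of instructions executed, which matches the "it's the I that counts" point.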

Starting point is 00:21:20 And speed-wise, again, varies a lot depending on what the program is doing, but again, running at half speed is again typical. Okay. And running at half speed isn't that bad for something that you just turn on in order to get a log file that you can solve the bug with.

Starting point is 00:21:40 Right, absolutely. So in the context of your CI or other test environment, that really shouldn't be an issue at all, especially as you can turn it on just for those intermittently failing, flaky tests. Or some of our customers run two lots of their tests. If they're running this in continuous integration, they just run it twice, once without recording, once with recording.

Starting point is 00:22:09 If they can capture some of those failures, then they're happy. So they have those running in parallel. And yeah, even for your desktop day-to-day debugging, if it's running at half speed and you save one restart, then you win, right? Right.

Starting point is 00:22:29 So I've noticed that if you have a debug build, that's simply just slower than a release build because fewer optimizations are enabled and the binaries are bigger. But I've noticed in my own experience that if I'm running a debug build inside the debugger, and the debugger is actually tracing the program execution, that tends to be slower again. But I've never actually measured that, and I'm just curious if you have any idea how much slower a regular debugger is versus what you're doing, with the tracing and rollback ability.

Starting point is 00:23:05 Yeah, like all of these things, it's the same kind of politician's answer of, you know, well, that depends. Right. But no, there certainly are times when the debugger performance can be pretty impactful, particularly, for example, things like loading a shared library. So if your program loads a DLL or a shared library, then typically that will trap back to the debugger, and, right, the debugger will have to do a whole bunch of work to parse that library and figure it out, and especially if it's got a load of debug info with all kinds of C++ templates and other stuff, that can expand quite a lot.

Starting point is 00:23:49 And so that can really impact performance. So certainly loading a shared library can be orders of magnitude slower inside the debugger. Once you're up and running, and this is all really about Linux. I'm not familiar with doing this on Windows, but I'd imagine it's the same. Once you're up and running inside a debugger, if you're not hitting internal breakpoints and the like, then it's likely to go,

Starting point is 00:24:16 you would expect it to go at full speed, with kind of zero slowdown compared to if there is no debugger attached. It's when, you know, for stuff like when a shared library is loaded, what the debugger will do is set a breakpoint inside the loader, so every time that hits, then there's a breakpoint. And every time you hit a breakpoint or something, that can be very slow, right? That can be the equivalent of millions of instructions by the time it's trapped

Starting point is 00:24:43 down to the kernel, control's come back up, the debugger's been scheduled, it's done whatever it's going to do, it then goes back into the kernel and continues the program being debugged. You know, if you're doing that once a second, it's fine. If you're in some tight loop doing that all the time, then that will hurt you. Okay. I want to ask more about the LiveRecorder for test. What does it take to integrate that into your CI? Is it a pretty straightforward process? Yeah, I mean, we always designed this with the clear goal that it needs to be as simple as possible, right?

Starting point is 00:25:17 So, you know, it's always been, you don't need to, you know, build in any special way, link against any special libraries, whatever. So we actually have two forms of the recording, two ways to kind of integrate that. So one is a library that you can just link against and then that has an API and you can start recording and stop recording

Starting point is 00:25:41 and save the recording to a file and do other things, configuring the size of logs and all that kind of stuff. And then the other model is a kind of external recording tool. And then that works more like strace or even rr, and allows you to record your program completely from the outside without needing to make any code changes at all. And so they have different trade-offs, right, about which one is going to be more convenient for you, whether you want the flexibility of the API or whether you just want to not change your unit tests at all. And so you can kind of pick and choose, and then it just goes as a

Starting point is 00:26:20 as a pipeline stage you know in your in your Jenkins or whatever, right? So, you know, yeah, getting it set up is, you know, it's really not a complicated thing at all. Okay, so if you're not using the library integration, it's as simple as, you know, start LiveRecorder and then run all my unit tests and then finish LiveRecorder, that type of thing? Yeah, indeed.

Starting point is 00:26:44 So, you know, so it's like literally in the simplest case, it's literally just prepending your... Say your test is an executable, you're just going to run that, and it's just prefixing live record, and then the test invocation, right? So it's just like running strace. And then obviously that gives you different trade-offs with,

Starting point is 00:27:10 then you have to record the whole test, which you may or may not want to do. And then you kind of want to configure whether you're going to save that recording or throw it away. And there are options to control that, to make it a little bit easier. But yeah, it's always been a really important design requirement for us that you need this to be as easy as it can be to get going. Because, you know, we

Starting point is 00:27:43 just knew that with every extra hoop that the developer or the DevOps person or whoever it is needs to jump through, you're going to turn away half your potential audience or worse. So as you mentioned, over the past few years, it seems that reverse debugging has become more of a thing. What are some of the changes you've seen in the industry that make you feel that way? Yeah, I mean, so as well as, you know, projects and products coming along, which I think we mentioned the main ones of rr

Starting point is 00:28:29 and also the time travel debugging from Microsoft, which was announced by them at CppCon this year. And there's other stuff as well. So those are the kind of C++ ones, but there's other... The PyPy project has some reversible debugging built into it now,

Starting point is 00:28:47 although that's still experimental, I think, but it's there. And actually Microsoft have time travel debugging for JavaScript through their ChakraCore engine. So it's happening in a number of places, and I apologize to anybody if I've forgotten to mention you, but I think those are the main ones. And of course, inside GDB as well, although that implementation works by single-stepping the program and recording the changes every instruction makes, which has the benefit of simplicity

Starting point is 00:29:31 and works pretty well when it works. It's just that it's so slow and also generates so much data. So what it means is you're kind of limited: if you know there's a bug right here, you can turn on the recording for that bit, and as long as it's just, I don't know, a million instructions or so, then you're fine. I think it's also used a bit if you're just trying to explore some code. If you've got some really clever, in inverted commas, piece of C++ code that's doing something and you just can't figure out what it's doing, then that can be useful. But, you know, that's kind of limited.
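For contrast with replay, the "record the change made by every step" approach can be sketched as an undo log. This is a toy model of the general idea only, not GDB's actual implementation; UndoLoggedMemory and its members are hypothetical names for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy memory that saves the old value of a cell before every write.
class UndoLoggedMemory {
public:
    void write(std::size_t address, std::int32_t value) {
        assert(address < cells_.size());
        log_.push_back({address, cells_[address]});  // save what is about to be destroyed
        cells_[address] = value;
    }

    std::int32_t read(std::size_t address) const { return cells_.at(address); }

    // Undo the most recent write. Returns false when no history is left.
    bool reverse_step() {
        if (log_.empty()) {
            return false;
        }
        cells_[log_.back().address] = log_.back().old_value;
        log_.pop_back();
        return true;
    }

    // One log entry per write: this is why the log grows so quickly.
    std::size_t log_entries() const { return log_.size(); }

private:
    struct Delta {
        std::size_t address;
        std::int32_t old_value;
    };
    std::array<std::int32_t, 16> cells_{};  // zero-initialized toy memory
    std::vector<Delta> log_;
};
```

Stepping backwards is trivial here, which is the simplicity being described, but the log gains an entry on every single write. At a billion instructions per second that becomes the gigabytes-per-second problem mentioned earlier in the conversation, so recording only a short window is the practical way to use this style.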

Starting point is 00:30:16 And I think while it was really just us and GDB, that was actually problematic for us. A lot of people would try GDB and say, oh, yeah, you see, I knew this was too good to be true. And so it made a sort of credibility problem. I think now that you're seeing, you know, these other things coming along which are a bit more, using some of the same techniques that we use

Starting point is 00:30:46 and so, you know, essentially much more scalable, people realize, oh yeah, no, this is real, this can really work. And it's becoming a bit more of an accepted thing, and we certainly come across it more and more. Still, though, to be honest, it still surprises me how niche it still is, right? Obviously there's our offering, there's this open source stuff available on Linux, there's the stuff from Microsoft. Fair enough, that's quite new, but still, it's there. And still, I don't know, it kind of feels like somewhere between one in ten and one in a hundred people that I speak to, certainly that few, have tried it, or maybe even heard of it as a concept, which still

Starting point is 00:31:39 surprises me, because it just is such a powerful thing and has such an impact. You know, I'm sure you and a lot of your listeners will have heard the nice pithy quote, I think it's from Brian Kernighan, that debugging is twice as hard as writing the code in the first place, so if you write the code as cleverly as you can, how will you be smart enough to debug it? Right. Which is a very useful piece of advice, especially for kind of, you know,

Starting point is 00:32:12 programmers kind of early on in their career who are trying to, you know, get a bit too clever, as it were, which I think we've all been through that phase. But it's interesting, because if you think about what that means, the implications of that statement it's like well so okay so that means then you can only write the code that you can

Starting point is 00:32:30 debug. So debuggability becomes the limiting factor on the code that you can write, right? And whatever your metric for good is, whether it's fast, whether it's extensible, whether it's portable, whether it's whatever, the limiting factor of that is how debuggable it is. So it's always amazed me that debugging in general, I think, gets as little attention as it does, and particularly that reversible debugging still remains just not that well known. I think it's changing, and it's certainly very different than it was four or five years ago. And I think that change will continue, and I fully expect over the next, I don't know, two to five years, let's say, it will become just accepted normal practice, and any

Starting point is 00:33:37 environment that doesn't allow you to do that will be painful. But yeah, we're still not even anything like halfway through that transition. It reminds me of some of the topics that Kate Gregory has talked about, where she has been saying to not even necessarily teach your students printf or cout or whatever straight away to look at the output of something. Instead, teach them in the debugger to put a breakpoint there and look at the state of that variable and start right out from the beginning understanding how these tools work. Yeah, yeah, absolutely.

Starting point is 00:34:16 And I know there's the school of thought that debuggers can make programmers lazy, if you like. I don't subscribe to that, I guess obviously. It reminds me of the view that allowing students to use calculators is kind of cheating, right? And there's good research out there that shows that actually, if you allow kids to use calculators, their mental arithmetic gets better, because they just do more math, they're just practicing this, they're just becoming part of it. And it helps them, it takes the chore away from maths, right? And it makes it more enjoyable.

Starting point is 00:35:06 So, you know, it's this kind of, I don't know, what's the word, sort of masochistic kind of, you know, real men don't use debuggers, right, and they sort of go through the pain. I just think that's daft, right? If you've got these tools. And it's debuggers and, you know, it could be static analysis and, you know, dynamic analysis with Valgrind and whatever it is. You know, we have these tools.

Starting point is 00:35:34 Like, you know, programming is hard enough as it is. Like, we don't need, like, some kind of monastic, self-inflicted pain to make it worse. Yeah, I definitely agree. So we mentioned Microsoft's announcement of their time travel debugging toolkit. Obviously, that one's Windows only, and Undo is Linux only. Do you see many comparisons between the two, though? Are you picking up any ideas from what they did?

Starting point is 00:36:05 Yeah, it's definitely interesting stuff. It's a different technique than we use. So what they're doing, although, actually, as far as I know, none of this is published particularly, so I'm interpreting from the outside. But it looks like what they're doing is they are using dynamic JIT instrumentation in a similar way that we do. But all that we record, as I mentioned at the beginning, is the non-deterministic inputs that come in, right? So essentially we're exploiting the natural determinism of computers. So if you run a program multiple times, and every time you give it the same starting state, and as you run it you feed it the same inputs, then it will

Starting point is 00:37:03 always do the same thing, right? This is why random numbers are difficult to generate. And so we exploit that. When we're recording, all we record is the non-deterministic inputs, which is mostly system calls, but there's also, you know, thread switches and signals and shared memory and some CPU instructions as well. And then we replay those precisely at the appropriate time during replay and guarantee that the program follows exactly the same path as it did in record.
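The record-and-replay idea described above can be illustrated with a toy sketch. This is not Undo's implementation: the `Recorder`, `Replayer`, and `program` names are hypothetical, and a random number and a clock read stand in for the system calls, signals, and thread switches mentioned in the conversation. The only point is that logging the non-deterministic inputs is enough to make a second run follow the same path.

```python
import random
import time


class Recorder:
    """Performs the real non-deterministic call and logs its result."""

    def __init__(self):
        self.log = []  # ordered list of (name, value) events

    def call(self, name, fn):
        value = fn()
        self.log.append((name, value))
        return value


class Replayer:
    """Hands back logged results in order instead of calling the real source."""

    def __init__(self, log):
        self._events = iter(log)

    def call(self, name, fn):
        logged_name, value = next(self._events)
        if logged_name != name:
            # The program asked for a different input than it did in record.
            raise RuntimeError(f"replay diverged: expected {logged_name}, got {name}")
        return value


def program(env):
    """Toy program whose control flow depends on non-deterministic inputs."""
    path = []
    total = 0
    for _ in range(5):
        roll = env.call("random", lambda: random.randint(1, 6))
        stamp = env.call("clock", time.time)
        if roll % 2 == 0:
            path.append(("even", stamp))
            total += roll
        else:
            path.append(("odd", stamp))
            total -= roll
    return path, total


if __name__ == "__main__":
    recorder = Recorder()
    recorded = program(recorder)
    replayed = program(Replayer(recorder.log))
    print("same path on replay:", recorded == replayed)
```

Everything else in `program` is deterministic, so nothing beyond the ten logged values needs to be stored to reproduce the run.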

Starting point is 00:37:36 The Microsoft implementation is, I think, a bit different. I think what they're doing is to instrument the code but to record all of the memory reads. So every time it reads from memory into the CPU, you log what was read. I suspect they're doing something a little bit clever and only storing what was read if it's different from what was accessed last time.
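As a rough illustration of the guess above (log a memory read only when its value differs from what was last seen at that address), here is a small sketch. It models the speaker's outside interpretation, not Microsoft's actual trace format; `ReadLogger`, `replay_reads`, and the addresses are hypothetical names chosen for the example.

```python
class ReadLogger:
    """Logs a read only when the value is not already implied by the trace."""

    def __init__(self):
        self.shadow = {}  # address -> last value the trace knows about
        self.log = []     # (read_index, value) for surprising reads only
        self.reads = 0    # running count of reads seen so far

    def on_read(self, address, value):
        if address not in self.shadow or self.shadow[address] != value:
            self.log.append((self.reads, value))
            self.shadow[address] = value
        self.reads += 1


def replay_reads(addresses, log):
    """Rebuilds every read value from the sparse log and the address sequence."""
    pending = dict(log)  # read_index -> logged value
    shadow = {}
    values = []
    for index, address in enumerate(addresses):
        if index in pending:
            shadow[address] = pending[index]
        values.append(shadow[address])
    return values


if __name__ == "__main__":
    reads = [(0x10, 1), (0x10, 1), (0x20, 7), (0x10, 2), (0x20, 7), (0x10, 2)]
    logger = ReadLogger()
    for address, value in reads:
        logger.on_read(address, value)
    rebuilt = replay_reads([address for address, _ in reads], logger.log)
    print(len(logger.log), "logged of", len(reads), "reads")
    print("rebuilt matches:", rebuilt == [value for _, value in reads])
```

The sketch assumes the sequence of addresses is reproduced by re-executing the code during replay, so only the values need logging. Even with the deduplication, every changed read costs a log entry, which is consistent with the larger log files discussed next.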

Starting point is 00:38:02 But yeah, I think that's essentially what they're doing, which gives different characteristics in terms of performance. So it's significantly more data hungry, if you like: the log files get a lot bigger. The log files grow at, I think, if I remember rightly, they're saying it's a bit per nanosecond of execution, which I think roughly equates with that approach. And the slowdown is a little bit worse. It does have the advantage, though, that you can record genuinely multiple cores at the same time,

Starting point is 00:38:40 which is pretty neat. And I must say also they've done a great job in the stuff they've built on top, which is kind of interesting. So you can query using a sort of SQL-type language. You can query the recording and find what the... Rather than sort of stepping through in the debugger, you can write some SQL and find, like, you know,

Starting point is 00:39:05 I don't know, show me how this particular variable changed state during the execution of the program. So that's kind of neat. And I think they've used it a lot internally, right? So although this is actually sort of announced as a new thing, it's been used internally at Microsoft for many years. And I think at the announcement at CppCon they said that over 50 percent now of internal Microsoft bug

Starting point is 00:39:34 reports come attached with a recording, a trace file as they call it, as opposed to, you know, some instructions of do this, do that, do the other, and then it died. So I guess inside Microsoft, it's not a niche thing at all. It's already made that transition. So yeah, very exciting to see how that spreads to the rest of the world. Yeah, so based on what you said earlier about how many developers you run into who are aware of reverse debugging, it seems like it must be exciting for you just in general to see that it's getting more and more awareness. Yeah, yeah, absolutely.

Starting point is 00:40:13 And, you know, I think for quite a long time I did feel a little bit like a lone voice in the wilderness. Right. I mean, I started tinkering with this stuff back in 2005. We weren't really a business for quite a long time. It was like an evenings and weekends project with me and Julian Smith, my co-founder, before we kind of made it a proper company in 2012. And I kind of quit the day job and started working out of the shed in my garden full time. But, you know,

Starting point is 00:40:56 certainly in that intervening seven years, I remember having a conversation with, I don't know what you'd call it, like a business advisor type person, back in the early days. I said, yeah, I'm going to build this thing, and it's going to be totally awesome, and everyone's going to want it. And I remember him saying to me,

Starting point is 00:41:17 well, you know, have you thought about marketing? You know, how are you going to get the word out there? I said, oh, no, no, no, it's fine. It'll just go viral. Like, this is so powerful. Everybody will just, you know. And he said, well, you know, everyone kind of says that. So, yeah, yeah, but we're different. Yeah, everyone thinks they're different. Yeah, I'm sure everyone does think they're different, but like, this really is different. And I remember even, this is a bit embarrassing, but when we announced it, we put it up on the website

Starting point is 00:41:46 and said, here it is, come download a trial and all that stuff. And I actually warned our web hosting provider that we were going to go live and there could be a big load on the server. And they said, yeah, okay, it's fine, I think we've got this.

Starting point is 00:42:02 And I'm not sure what I expected them to do, but I just sort of felt I should warn them. And of course, like, you know, absolutely nothing happened, right? It was just tumbleweeds. And so, I admit, there were some dark moments when Jules and I had been plugging away at it for like, you know, five, six years. And, you know, you're just thinking, wow, still no one's paying attention to us. Am I just like deluded? Is this just really not actually that useful after all?

Starting point is 00:42:36 So, yeah, it's very gratifying now to see that I wasn't mad. I was just impatient. Right. That kind of transitions into my next question. At CppCon this year, you actually talked more about the business aspect of Undo and your process going through that. Unfortunately, that talk wasn't recorded as far as I can tell. Do you want to tell us a little bit more about it? Yeah, right.

Starting point is 00:42:59 It was a lunchtime talk. So, yeah, not recorded. It was, I mean, fair enough, because it's a C++ conference and clearly the talk wasn't really about C++, so it was one of the kind of open sessions. But it was very well attended. I was very pleased with how many people actually came to see it. Do you want to tell us a little bit more about the topics you discussed in that talk? Yeah, yeah. So, I mean, the main part being, I think, you know, a lot of software developers are kind of thinking along the lines of, well, you know, maybe

Starting point is 00:43:31 you know, it's an attractive thing to go and do a startup. And whether you want to start a business so that you can get the income to write the code that you want, or whether you want to write some code so that you can make the business you want, it's a common thing to do. So basically my talk was saying to everybody, for God's sake, don't do it. No, I'm just kidding. But there were a lot of surprises, right? So I mentioned one earlier about, you know, how difficult marketing is and how hard it is to get

Starting point is 00:44:09 noticed. And there were a whole number of surprises, things I thought I knew or didn't even know I didn't know. And, you know, so stuff around, you know, funding. And I think a lot of it was also around the kind of, not just about if you're going to do a startup, but the kind of commercial aspects of, you know, most software developers need to interact with commercials

Starting point is 00:44:41 in some way or another, right? They need to talk to the sales people inside the company they work at or, you know, because, ultimately, who wants to write software that nobody uses, right? So even if it's an open source project where you're just looking to get users, you know, you still have, you know, so there is this kind of,

Starting point is 00:45:00 for software to become useful, you have the kind of, yeah, you've got to write the software, but just to kind of, you know, build it and they will come, well, you know, it's not like that. And for programmers, the commercial side of it is often very counterintuitive and just hard to understand. So, you know, I mean, it's the classic thing that the engineers, you know, like, oh, those sales guys are all kind of stupid and aggressive. And, you know, why do they keep going off doing all these stupid things?

Starting point is 00:45:33 And, you know, I've worked in different size companies over the years as a software developer and certainly have felt that myself. And then it's quite a realization that actually the thing with sales is that you have to persuade other people to do what you want them to do, and that's way harder than persuading the computer to do what you want it to do. And then with marketing, it's even worse, because you have to persuade these people to do something. And you haven't even met them. You don't even get to talk to them, and yet you have to persuade them to do what you want them to do.

Starting point is 00:46:12 And, you know, like I remember when I was an engineer in other startups, and I'd be the engineer that the sales guys would like because they could, you know, kind of wheel me out to customers and I could do the PowerPoint presentation and do the demo and, you know, not kind of fall to pieces or whatever. And I'd think, not in a particularly bitter way, but I'd kind of think, you know what, I did all the work there, right? Sales guy, you know, brought me in, shook hands with a few people, introduced me. I presented some slides that I'd written, I did a demo that I'd produced, I answered all the questions and, you know, struck up the rapport with people in the room, and then at the end, you know, the sales guy shook hands and left, and

Starting point is 00:46:57 then, you know, a few months later the purchase comes in and, like, sales guy gets all the credit. And, you know, of course, I had absolutely no idea how much work had gone into getting me in the room in the first place, and then how much there was after that. Taking a delighted response, people saying, yeah, this is awesome, we really like it, and we definitely want to have further conversations and stuff, and turning that into a sale where money changes hands is really hard. Just so many things can go wrong, and getting that sense of momentum and excitement to continue

Starting point is 00:47:40 once the people go back to their desk and they check their emails and they have four urgent emails in their inbox, and it's, you know, just this whole other world that I think as software developers we're often exposed to just slightly, but we don't really see it. And I think, you know, hopefully that can help. I think by understanding the people on the sales and marketing side inside the organizations that we work for better, you know, hopefully we can all work a bit better and maybe, you know, enjoy our work a bit more. Yeah, definitely. So where would people go if they want to go and try out Undo and LiveRecorder and everything? Yeah, so just go to undo.io and a couple of clicks through

Starting point is 00:48:32 that I think are hopefully fairly well signposted and you can get an evaluation and just try it out. And yeah, I always really encourage feedback from people who've tried it as well. So whether that's, you know, good or bad, or even a bit pernickety, you know, actually, nothing's too pedantic. I think often people kind of don't want to give feedback because they think, yeah, well, that kind of doesn't really matter. But you know what, if you think it could be a bit better in this way or that way, then please do let us know. Okay.

Starting point is 00:49:09 Thank you so much for your time today, Greg. Okay. Yeah. Thanks, guys. It was a pleasure. Thanks for joining us. Thanks so much for listening in as we chat about C++.

Starting point is 00:49:19 I'd love to hear what you think of the podcast. Please let me know if we're discussing the stuff you're interested in. Or if you have a suggestion for a topic, I'd love to hear about that too. You can email all your thoughts to feedback@cppcast.com. I'd also appreciate it if you like CppCast on Facebook and follow CppCast on Twitter. You can also follow me at @robwirving and Jason at @lefticus on Twitter. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode is provided by podcastthemes.com.
