CppCast - UndoDB and Live Recorder

Starting point is 00:00:00 This episode of CppCast is sponsored by Undo Software. Debugging C++ is hard, which is why Undo Software's technology has proven to reduce debugging time by up to two-thirds. Memory corruptions, resource leaks, race conditions, and logic errors can now be fixed quickly and easily. So visit undo-software.com to find out how its next-generation debugging technology can help you find and fix your bugs in minutes, not weeks.

Starting point is 00:00:25 Episode 40 of CppCast with guest Dr. Greg Law, recorded January 8th, 2016. In this episode, we talk about tech startups using C++. Then we'll talk to Dr. Greg Law from Undo Software. Greg will tell us about how UndoDB and LiveRecorder will change the way you debug. Welcome to episode 40 of CppCast, the only podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today? Doing great, Rob. How about you? Doing good. I saw on Twitter earlier this week you shared your pictures.

Starting point is 00:01:41 Do you want to mention that real quick? Yeah, so one of the reasons why the podcast has been semi-irregular over the holidays is I got back from a Kenyan safari recently and a bunch of travel over the holidays, and had a great time. Yeah, the pictures are pretty nice. Do you have a high zoom on your camera, or were you really that close to those animals? We were easily within 25 feet of lions several times. Some lions walked right past the trucks. But I also did have a 30 times zoom on my compact camera. So it's even hard for me to remember on some of the pictures whether they were close up or whether they were zoomed in a lot, but it's a combination of both. Okay, very cool. Maybe we can even put that Flickr link in the show notes if people want to check it out. Sure. Okay, so at the top of every episode, I'd like to read a piece of feedback.

Starting point is 00:02:31 This week, I got a tweet from Guido, who's saying the CppCast episode with Richard Thomson is by far the best he's listened to so far. A must-hear for every developer, not just C++ developers. That was a while ago, but Richard Thomson was definitely a great guest. That was when we were talking about exercism.io, which is the online kind of testing... how do you describe it, Jason? That's like programming training, I

Starting point is 00:03:00 guess, might be right. And it's kind of collaborative, like you go through it and then you have someone reviewing your code. It definitely sounded like a really nice tool. I haven't yet actually gone back and tried it out. Neither have I. Hopefully, the shows keep meeting his expectations since that was a while ago. Yeah, absolutely. Well, we'd love to hear your thoughts about the show as well.

Starting point is 00:03:21 You can always reach us on Facebook or Twitter. Email us at feedback at cppcast.com. And don't forget to leave us reviews on iTunes. So joining us today is Dr. Greg Law. He is a co-founder and CEO at Undo Software. He spent nearly 20 years writing systems level code, including novel kernel designs and networking architecture in academia and at a variety of startups. Greg finds it particularly rewarding to turn innovative software technology into real business development. He still gets to write some code, although sadly, most of his coding these days is done on airplanes.

Starting point is 00:03:54 Greg lives in Cambridge, England with his wife and two children. Greg, welcome to the show. Hi, thanks. So, novel kernel designs. Is there anything about that that [...] you can decompose the kernel down to very fine granularity and have different components for, you know, the scheduler, the memory manager, that kind of thing, and then have those components actually be protected from each other, so that if one goes wrong it can't stomp over the other, but yet do that in

Starting point is 00:04:43 a way that didn't have very bad performance overheads. So we had got a cool way to play around with the x86 segmentation registers and to allow domain crossing in a very efficient way. And that kind of started life as my PhD thesis, and so I made this OS called Go, which was short for Gregos, and then subsequently Google stole the name, of course, for the language that came later.

Starting point is 00:05:21 But some of those ideas kind of live on inside Xen, inside the Xen hypervisor, although unfortunately when the 64-bit x86 CPUs came along they kind of nobbled the segmentation stuff. But yeah, it was fun days. Cool, very cool. So we had a couple of news items to talk about before we get into talking more about Undo Software. The first one is C++ status at the end of 2015. And Greg, feel free to jump in with any of these. And this is kind of like just an overview of all the C++ news that went on in 2015. I feel like we've probably touched on this at one point or another over the past 10 months, but it's definitely a nice refresher. Was there anything specific you wanted to call

Starting point is 00:06:05 out in here, Jason? I don't recall seeing anything that really jumped out at me, but it is a lot of great information to look forward to what's coming and what has been in recent C++ news. Yeah. I mean, just a couple of highlights: you know, Clang and GCC are now fully C++14 compliant, Visual Studio is making a ton of progress this year with C++11 and 14 as well, and, you know, all the work that's been done towards C++17 with the different technical specifications. And there was all the C++ Core Guidelines work out of CppCon, which is really exciting. So I thought the increase in C++ popularity on some of the indexes was quite interesting to see as well, and it kind of,

Starting point is 00:06:54 you know, with some of these indexes you kind of need to take it with a bit of a pinch of salt, right, because it's just very hard to really have an accurate view on this stuff. But it corroborates the kind of anecdotal evidence that I've seen all over the place as well, where, you know,

Starting point is 00:07:10 just a few years ago, people would tell me that, you know, C++ is a sort of dying legacy thing. And you certainly don't hear that anymore.

Starting point is 00:07:19 And increasingly... I was just in one of the very large banks just the other day, and they were saying that, you know, they'd kind of mandated that all the code would be written in Java from now on.

Starting point is 00:07:28 This was a few years back, and they've kind of revoked that requirement now, and they're sort of seeing more and more C++. So, yeah, it's good to see it coming back. Absolutely. That was from the TIOBE Index, where C++ is currently listed as number three, just below C, with Java holding the number one spot. So it looks like C++ is kind of inching up more and more against C, which makes sense to me.

Starting point is 00:07:57 Makes sense to me, too. And if you combine C and C++ together, then actually I think it gets the number one spot quite comfortably, right? Yeah, so Java's at 21.4, C is at 16, and C++ is at almost seven, so yeah, together they easily trump Java. Cool. Kind of moving off of that, there was an interesting article on Medium about starting a tech startup with C++. And it was just an interesting read where this guy, I think he's the CTO for some new company, and they're working on a web service, and he decided it makes sense to write it in C++. And he had a lot of colleagues in the industry tell him,

Starting point is 00:08:40 oh, that doesn't make any sense. You should be writing it in Ruby or Python, blah, blah, blah. And he just kind of went over his reasons why he thinks C++ makes sense. And it absolutely does. I mean, look at Facebook with, you know, the amount of server time that they save by using C++ instead of one of those dynamic languages. Yeah. And one of the notes at the bottom here is this guy points out that his C++ example code is 40 times faster than the Python equivalent. Yeah, that's huge. It's huge, and, you know, it's a huge cost savings for his new company, right? Yeah, I guess the thing with... and this is kind of to put my CEO hat on rather than the dev hat, when you're starting a company,

Starting point is 00:09:29 the economies of the efficiencies kind of come out differently. So server time, if you're starting up, it's not very expensive because you don't have that many users to begin with, right? And I think particularly with web services type stuff, it was impressive to see that 40x improvement. I read that article. And so it totally explains why the likes of Google do that, because they have millions of servers,

Starting point is 00:10:00 and if you can improve, and it doesn't need to be anything like 40x, if you can improve by 10% the amount of transactions per server you can handle and you scale that across millions of servers, across data centers across the world, then it makes sense to pay people the money

Starting point is 00:10:15 to provide that efficient code. If you're able to just run on one or two EC2 instances, and based on the premise that you can write the same code more quickly in something like Python or Ruby,

Starting point is 00:10:34 then it kind of makes sense to just do that, because compute time is cheap and programmer time is very expensive. But I guess he's planning for success, and, you know, hopefully he'll have lots of users, and he'll get a return on that investment nice and quickly. And you kind of led into something that we haven't talked about yet on this show, I believe, some of these Facebook libraries that he mentions.

Starting point is 00:11:00 Folly, yeah. Yeah, Folly, and some of the higher-level HTTP servers and stuff for C++ that can make these things easier and more succinct in C++ than they typically are thought to be. Right. So this next article is a little sad. It comes from Scott Meyers's personal blog. And the headline is "good to go". And it's just talking about how he has made the decision to effectively retire from the world of active involvement in C++.

Starting point is 00:11:32 I'm really glad we were able to have Scott on the show when we did. And I'm at the same time really excited to see what he decides to do next with his time. Jason, was there anything you wanted to add here? No. I'm also interested to see if he still comments from time to time on the direction of C++ or whatever else he goes on to do. And I believe, it's just interesting to note,

Starting point is 00:11:56 I think you said this is his 25th anniversary of involvement in C++? Yeah. Wow. 25 years of guiding all of us in how to be better C++ developers, basically. So hopefully Scott's going to still listen to the show. I know when he came on he told us he was a listener. So we wish you the best, Scott, and look forward to seeing what you do next. And then this last one I was kind of... Sorry, I just wanted to jump in, if that's okay, but

Starting point is 00:12:22 yeah, I was also sad to see that. I remember reading his books, you know, way back. Yeah, gosh, well, maybe not quite getting on for 25 years ago, 20 years ago, certainly. And, you know, a lot of stuff being explained very clearly, and it was very interesting. But I wonder whether it's just a general sort of comment on C++. And as you said in the intro, you know, I don't get to spend nearly as much time on the tech as I would like to these days. C++ is so fast-moving and so big that there is an issue, I think, that if you're not able to dedicate all of your professional life to keeping up to date and, you know, really understanding it all, then it's kind of hard, right? So I don't know what the answer is here, and I understand, you know, why

Starting point is 00:13:14 it's as big as it is, and all that stuff is certainly very powerful. But, you know, for people like me who spend kind of less than 50% of their time developing, it makes it hard. Well, Scott kind of started to try to address that in one of his most recent articles, suggesting that it's time to start deprecating some of the more dangerous and least often used features of C++ and to try to actually simplify the language down to what it needs to be.

Starting point is 00:13:45 See if that goes anywhere. Yeah. No, that... although it still has the issue of the fast movement, you know, like the new version coming out every three years and,

Starting point is 00:13:55 you know, a lot to sort of keep on top of, and it sort of strikes me that C++ is less a programming language, more a way of life. Great. So this last thing I wanted to bring up was that the C++Now conference

Starting point is 00:14:10 is putting out its 2016 call for submissions. And there's actually a really short deadline on this one. The submissions are due January 29th, which is only three weeks from now. And then proposal decisions will be sent out February 22nd. Jason, are you planning on trying to go for this one again? Yeah, I'm going to make a submission again. I didn't mention on the show here that my submission to ACCU was denied, so I will not be going to that conference.

Starting point is 00:14:40 But yeah, I'm going to see if I can go to Aspen again. It's a great conference and a wonderful venue. Now, Greg, I know... I will be at ACCU. I did it first at CppCon, but a talk on GDB and tips and tricks with that. Because having sort of slightly complained that C++ has got rather big, GDB has got pretty big as well over the years. And that was a great talk, by the way. Oh, thanks.

Starting point is 00:15:08 I'm sorry. Go ahead. I didn't mean to interrupt. No, I just said thanks. It was just nice to have the compliment, so thank you. Do you have any plans to go to C++Now or CppCon again, Greg? CppCon again, I hope to if I can. C++Now, I don't think I'm going to get there this year.

Starting point is 00:15:31 It's just a little bit too niche. My co-founder, Julian, went last year. And I think that definitely comes into the, you know, kind of if it's your way of life, then it makes a lot of sense. But if it's just sort of something that's related to something you do, it's a little bit too intensive, I think. That's probably fair. Yeah, I think, Jason, I think you're going to have to go by yourself again to C++Now to represent this show. I don't think I'm going to be able to make it this year, but hopefully CppCon I'll get to.

Starting point is 00:15:58 Very good. Okay, so Greg, let's start talking about Undo Software and UndoDB. Could you give us an overview of UndoDB to get us started? Yeah. So, at least in its original incarnation, it looks just like GDB. Actually, let me step back on that a bit. The fundamental technology is a way to record the execution of a program so that you can go back and see what it really did.
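Editorial sketch: one common way to make an execution replayable is to log only what the program cannot recompute for itself, such as clock reads or other outside inputs, and to feed the logged values back on replay. The C++ below is a toy illustration under that assumption. `InputLog` and `run_program` are names invented here, and nothing in it reflects how UndoDB is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Toy log of one non-deterministic input stream (illustrative names only).
class InputLog {
public:
    // Record mode: call the real source and remember what it returned.
    std::uint64_t record(const std::function<std::uint64_t()>& source) {
        const std::uint64_t value = source();
        values_.push_back(value);
        return value;
    }

    // Replay mode: ignore the real source and hand back the logged value.
    std::uint64_t replay() {
        assert(cursor_ < values_.size() && "replay ran past the recording");
        return values_[cursor_++];
    }

    // Start the next replay from the beginning of the recording.
    void rewind() { cursor_ = 0; }

private:
    std::vector<std::uint64_t> values_;
    std::size_t cursor_ = 0;
};

// A computation that is deterministic apart from what `next_input` returns.
std::uint64_t run_program(const std::function<std::uint64_t()>& next_input) {
    std::uint64_t state = 17;
    for (int i = 0; i < 4; ++i) {
        state = state * 31 + next_input();
    }
    return state;
}
```

Running `run_program` once with `log.record(source)` as its input and again with `log.replay()` should end in the same state, even if `source` would return different values the second time around.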

Starting point is 00:16:28 And so we didn't want to replace developers' existing environments. So we kind of made it debugger agnostic. And so you get different debuggers that act as the front end. So GDB is the one that you would use by default. And most of our direct customers who come to us, that's what

Starting point is 00:16:56 they use because that's what they know. And then you use GDB within whatever environment you would usually use it. Some people just use it raw at the command line but some people will use it within Eclipse or from within Emacs or however like that. But we have other debugger front ends as well. So there's actually the ARM DS5, stands for Development Studio 5.

Starting point is 00:17:19 That's the development studio environment from ARM, and that has a feature that their marketing guys call Application Rewind, but that's us underneath, right? And that's DS-5 interfacing down onto us. And likewise, you get the TotalView debugger from Rogue Wave, which is kind of aimed at HPC kind of stuff. That has a feature that their marketing guys

Starting point is 00:17:41 call Replay Engine, and again, that's us underneath. And we'll have some others coming out this year. And really what it's about is giving these debuggers the ability to step and run the program backwards as well as forwards. So we all know that increasingly, well, I don't know if it's increasingly,

Starting point is 00:18:04 but I think always, developers spend a lot of time finding and fixing bugs. There's a quote I like to use from a guy called Maurice Wilkes, who arguably is the first person to be a professional programmer on a general-purpose computer. There are sort of various claims about what was the first computer, and different universities claim it, but the one at Cambridge was the first one to be used to solve real problems and actually do real work, so run programs other than just proving that this computer thing worked. He has a quote along the lines of: he remembers with great clarity the moment when the realization came over him that a good part of the rest of his life was going to be spent finding errors in his own programs. And that quote kind of makes me laugh, because I remember the same, like when I was, I don't know, 15 or so and just playing around with programming, and it

Starting point is 00:19:01 was like, you know, yeah, this is kind of hard to get right. And I think whatever we do with languages and with smarter ways to write code, it just encourages us to be smarter still and push the boundaries. And so you're just always going to spend a lot of your time trying to make it work. And so we knew,

Starting point is 00:19:26 everyone knows that it's difficult because you're trying to think, how did that happen? Why does this variable contain the value that it does? It should be impossible. How can that be? And if you're lucky, the bug itself is very close to where it goes wrong. So, you know,

Starting point is 00:19:41 you immediately assign null to a pointer and dereference it and you get a core dump there and then. But all too often, the problem itself is some way back in history. So, you know, we knew, you could see very clearly

Starting point is 00:19:56 that if you could just step backwards, or if you could do things like put a watchpoint or data breakpoint on that bad value and run back to the line of code where that was changed, then that would be a very powerful way to get to the bottom of some of these very difficult issues.
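Editorial sketch: the idea of watching a bad value and running back to the write that produced it can be imitated, very loosely, by a wrapper that remembers who wrote each value. The C++ below is only a toy model of the question a reverse watchpoint answers. `Watched` and the writer labels are invented for illustration, and a reversible debugger does not require instrumenting the program like this.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for "watch this value and go back to whoever changed it".
template <typename T>
class Watched {
public:
    explicit Watched(T initial) : value_(std::move(initial)) {}

    // Every write is tagged with a label naming the code that performed it.
    void set(T next, std::string where) {
        history_.push_back({value_, std::move(where)});
        value_ = std::move(next);
    }

    const T& get() const { return value_; }

    // Answers "which write produced the current value?"
    // Returns an empty string if nothing was written after construction.
    std::string last_writer() const {
        return history_.empty() ? std::string{} : history_.back().where;
    }

    // Undo the most recent write, loosely like stepping back past it.
    bool step_back() {
        if (history_.empty()) return false;
        value_ = std::move(history_.back().previous);
        history_.pop_back();
        return true;
    }

private:
    struct Write {
        T previous;         // value before this write
        std::string where;  // label of the code that wrote
    };
    T value_;
    std::vector<Write> history_;
};
```

With a `Watched<int> balance(100)` that is later set by `"charge_fee"` and then by `"apply_discount"`, asking `last_writer()` when the value looks wrong points straight at the most recent writer, and `step_back()` restores the value from before that write.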

Starting point is 00:20:12 So, yeah, so that was sort of UndoDB and the genesis of that. My co-founder, a guy called Julian Smith, and I started over 10 years ago now and released the first version in 2006. It was very clunky, very slow. We called it 1.0, but really it was 0.1. We've developed it over the years. For quite a long time with that, actually, we just did it in our spare time, trying to get enough income coming in so that we could put food on the table. And so it

Starting point is 00:20:46 takes a long time, right, when you're working evenings and weekends on something. It just took many, many years. But yeah, eventually, kind of three or four years ago now, we had enough that I could quit my day job and go full-time. A year later, Jules did the same. A couple of years ago we were still five people in a big shed in my back garden. And we've sort of gone from there. We've taken a few investment rounds. We're still kind of small, right? We're 18 people.

Starting point is 00:21:14 But, yeah, we're having a lot of fun. Awesome. So can we go into a little bit more about what platforms are supported by UndoDB? This is Linux and Android primarily? Right, exactly. So it's just very tightly tied in the implementation to the Linux kernel. It's very much at the kernel level. So we don't have any dependencies on glibc or anything in user space. So officially, I think, you would call it GNU/Linux, right?

Starting point is 00:21:48 That's the, certainly Richard Stallman's very keen that it's called GNU Linux because all of the user space you see is GNU. But actually for us, it really is Linux. It really is the kernel that we care about. So it just translates very easily over to Android,

Starting point is 00:22:00 which is the Linux kernel, but a different user space. So do you support anything like Raspberry Pi or any other small Linux platforms, or do you care? Yeah, no, we do care. It's very implementation tied to the kernel and also to the CPU architecture. So we do a JIT binary retranslation of the code as it runs in order to be able to do what we do,

Starting point is 00:22:27 in order to be able to pull out all of those non-deterministic inputs. We know all about the x86 instruction set, and we have a full decoder for that, and we do JIT transformation on that, and we do the same on ARM. But beyond that, we don't care. So we care very much what the OS kernel/user-space ABI is, we care very much what the instruction set architecture is, but beyond that,

Starting point is 00:22:54 whether it's running on a Raspberry Pi or a node in a big Cray supercomputer somewhere, it's all the same to us. I wanted to interrupt this discussion for just a moment to bring you a word from our sponsors. Do you hate it when your customer finds a bug in your code? Tired of spending ages trying to reproduce intermittent failures? At Undo Software, they know that debugging C++ can be hard. That is why their next generation debugging technology for Linux and Android is designed for C++ users

Starting point is 00:23:24 and is proven to reduce your debugging time by up to two-thirds. Embed Live Recorder within your program so that you have a complete record of your program's execution in production. Activate it in your test suite to solve intermittently failing tests and replay it later for offline analysis. Memory corruptions, resource leaks, race conditions, and hard-to-find bugs can now be solved quickly and easily. Visit undo-software.com for more information and start fixing bugs in minutes, not weeks. So, several tools that I've learned about recently for debugging and performance monitoring rely on being able to access the CPU hardware counters and therefore cannot run on VirtualBox.

Starting point is 00:24:06 Does UndoDB have any limitations like that? No, no. So we're entirely a software implementation,

Starting point is 00:24:15 which has pros and cons, right? So the main con being it's just harder,

Starting point is 00:24:22 it's more work. But the main advantage being it gives you more flexibility and you're less tied to these things. So it works just fine in VirtualBox or on an AWS instance or something like that. We care only that it's a Linux 2.6 kernel or later, which these days is just everything, and I think like an ARMv5 architecture or later, or on x86 I think it needs to be like the, you know,

Starting point is 00:24:58 Pentium or newer. So, you know, it's pretty generic. We knew from the beginning when we did this that if people are going to be able to really use this, it really needs just to work, as far as possible anyway, out of the box. And because when customers come to us for the first time, they're usually in crisis mode, right? It's hard when things are going well and

Starting point is 00:25:26 everyone's happy, it's kind of hard to get them to pay any attention to you. But when things really aren't going well, then, when they've tried everything else, then they'll talk to us. And at that point, you know, if it's, oh, but there's a load of caveats and you need to change your code or you need to do whatever it is, install a kernel module, make sure you're doing this, then they're already in a crisis and they don't really have the time to do that. And it does allow us to do some neat stuff as well. When you get into the details

Starting point is 00:26:04 of it, there's some kind of nasty things. Like, for example, some of the instructions are non-deterministic. So most instructions, if you add two numbers together and you have the same register state and the same memory state when you do that, then you'd better get the same answer

Starting point is 00:26:20 each time you do that. But other instructions, read the timestamp counter is one but there are some others, will give you a different result even when you do them in the same state. So being able to, you know, translate those in the way that we need to gives us some power, and critically, I think probably most importantly, allows us to deal with applications that have shared memory, right? Memory shared with another process, or memory shared with a device. I mentioned that we run on some of the Cray and other supercomputers, and one of the very common models there is they use RDMA, remote direct memory access. And what that means is one node on the network will be able to write directly

Starting point is 00:27:09 into the address space of another node. And you can imagine for a tool like ours, that can be problematic. But because we do that binary retranslation, we can actually intercept those changes and handle that case correctly. So, you know, maybe this is getting off in the weeds a little bit, but in that particular case, do you need to be running on all of the nodes that are interacting, or would just an individual node work? Yeah, no, just an individual node will work. I mean, happily, there is enough kind of communication with the OS

Starting point is 00:27:40 beforehand to kind of set that up, because, you know, obviously if you just allowed any process to write arbitrarily into another process's address space across the network, then, you know, that would be kind of anarchy. So there is a kind of handshake where they say, right, well, this is the memory region that I'm going to use to communicate with, and then there's a kind of, you know, authentication thing that allows that to happen. So we can see that happen, and then we can know, all right, this is a memory region that we need to treat specially. Okay. So to bring it back a bit, you talked a bit about how UndoDB kind of just works. What is the process like to switch over from GDB to UndoDB for a first-time user?

Starting point is 00:28:23 Simply, rather than using GDB, use UndoDB. It's that simple? It's that simple. And actually, UndoDB itself is a Python wrapper that, like most Python programs, has sort of grown over the years

Starting point is 00:28:39 from a hundred-odd lines to, I don't know how many, tens of thousands now. And then, yeah, that will invoke GDB in the right way so that actually you're using UndoDB underneath. So yeah, we make it, you know, completely transparent. Or if you're using, say, Eclipse, if you look in the right place there's a dialog box in Eclipse that says what the path to GDB is, and it's expecting just a path to

Starting point is 00:29:05 GDB. But if you replace that with the UndoDB command, then Eclipse will use UndoDB and be none the wiser. And of course, you know, it is using GDB in that case, right? And it's just with a kind of backend. So we fit between the debugger, be that GDB, DS-5, TotalView, TRACE32 now, whatever. We fit between the debugger and the program that's being debugged. Okay. So in UndoDB, you can step forward while debugging or step backward. You can run backward with the program. Is that correct?

Starting point is 00:29:43 That's right. Yeah. So, and you've mentioned Eclipse several times. What IDEs do you have that have the integration to support both the go forward and go backward kind of functionality? Most of them do already, because actually GDB has had this capability for some time. It's just really, really slow. So the vast majority of the front ends know how to take advantage of that. And if they don't, then you can at least,

Starting point is 00:30:22 as long as you can type at the command line, you can just type reverse-next or whatever rather than hitting the button. But yeah, on most of them you can find the buttons. We've actually got patches for a couple of the lesser used ones as well, just because customers like them or we like them. So there's KDbg, which is the KDE graphical debugger, and we patch that, for example, to have those reverse buttons as well. Okay. But you can use standard Eclipse, Emacs, yeah, what else? I think they're the main front ends that customers

Starting point is 00:31:08 use, actually, Eclipse and Emacs, but they've had the backwards buttons there for some time. Okay. You mentioned how, you know, the project is very closely tied to the Linux kernel. Are there any plans to support anything outside of Linux? Have you looked at that at all? Yeah, plan is a strong word. But yeah, I mean, I think it's kind of interesting, because the two things it depends on, of course... well, it's all about the return on investment, right?

Starting point is 00:31:43 So what's the customer demand for it, and how hard would it be for us to do? And as a young company, you know, trying to get ourselves established and get our place in the world, it's very frustrating, but you're forced to be quite short-termist, right? And I don't know, you can argue about whether that's a good thing or not, because actually if you have these long-term goals you often find that at least some of your assumptions were false, and then it's too big an investment to get there to learn. But anyway, for the rights or wrongs of it, truth is, you know, we have to be doing things that will return meaningful revenue to us in, you know, the next one, two, three quarters. So it makes it very difficult to embark on some of these larger

Starting point is 00:32:36 things like porting across to another platform. So for example, when we ported across to ARM, originally we were x86 only. When we ported it across to the ARM architecture, we did that very closely with the company, and they actually, you know, funded that effort. And we did that because we knew we were going to do the integration into DS-5, so it's all kind of part of that. It would have been basically impossible for us just to decide we're going to go off and do an ARM version now and then see if that sells. Okay. Should we talk a bit about your other product called LiveRecorder? Does that kind of take UndoDB and go a little further with it?

Starting point is 00:33:15 Right, right. So it takes kind of the core recording technology, but rather than being an interactive tool, it is a library that you can link against, and it's a very straightforward C API to do things like enable recording, stop the recording, save the recording to a file, and some sort of configuration stuff as well. So then, you know, if you link against this library, it's dormant by default, but if you call in to turn on the recording or whatever, then that means the program can record itself effectively.
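Editorial sketch: the shape of such an API (dormant by default, switched on by the program itself, saved to a file on request) can be mocked up in a few lines. Everything below is HYPOTHETICAL: the `demo_recorder` namespace and its functions are invented for illustration and are not the real LiveRecorder API. The mock also only logs explicit notes, whereas the product described here records the program's execution.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// HYPOTHETICAL stand-in for a self-recording library (not the real API).
namespace demo_recorder {

// Recording state; false until start() is called, i.e. dormant by default.
inline bool& active() { static bool on = false; return on; }

// The in-memory "recording": one string per noted event.
inline std::vector<std::string>& events() {
    static std::vector<std::string> log;
    return log;
}

inline void start() { active() = true; }
inline void stop() { active() = false; }

// Called by the application at interesting points; a no-op while dormant.
inline void note(const std::string& what) {
    if (active()) events().push_back(what);
}

// Writes one event per line; returns false if the file could not be written.
inline bool save(const std::string& path) {
    assert(!path.empty() && "a destination path is required");
    std::ofstream out(path);
    if (!out) return false;
    for (const auto& event : events()) out << event << '\n';
    return static_cast<bool>(out);
}

}  // namespace demo_recorder

// Application code: the program decides for itself when to start recording.
int handle_request(int payload) {
    if (payload < 0) demo_recorder::start();  // something looks wrong
    demo_recorder::note("payload=" + std::to_string(payload));
    return payload * 2;
}
```

In this mock, requests handled before the first negative payload leave no trace, and everything from that point until `demo_recorder::stop()` is kept and can be written out with `demo_recorder::save(path)`.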

Starting point is 00:33:54 And so you can then just deploy that program wherever, and it's particularly useful for independent software vendors who are shipping their code to places where they don't have any kind of control or visibility into it. So it might be a classic software vendor sending their code to somebody else, or it might be somebody who's deploying code up into the cloud, but often when your code's running up in the cloud,

Starting point is 00:34:17 it's quite difficult to get that visibility. And so when weirdness is afoot, the program can, under whatever circumstances it decides, start the recording and then save that recording to a file. And then you can take that recording file and load it up somewhere else, another time,

Starting point is 00:34:36 another place, and you load it up into UndoDB. So now you get all of that really cool stuff of reversible debugging and that ability to go back and forth, wind the tape back and forth to see what really happened. In addition, you kind of solve that reproducibility problem. So you're debugging offline an exact, precise copy of the failure as it occurred in production. And so that's what we call the LiveRecorder product. So the idea is you're recording those failures

Starting point is 00:35:15 as they really happen. So what kind of performance overhead does LiveRecorder have when you turn it on? So it does have a significant, well, certainly a very measurable performance overhead. So things generally run, I mean, it does depend an awful lot on the program. Sometimes it's not really noticeable. Sometimes it'll run at about half speed. And, you know, sometimes it can be a little bit worse than that. So most of our customers don't turn on LiveRecorder all of the time. So actually, in the early days, its internal engineering name for some time was

Starting point is 00:35:51 flight recorder. We decided not to call it flight recorder for a couple of reasons. One, well, actually there's some Java stuff called Flight Recorder and we didn't want to get confused with that. But also it's kind of misleading, because flight recorders are always on and only record certain interesting things, right? You know, the flight recorder is not going to tell you what the passenger in seat 43C had for lunch, which when a plane crashes probably doesn't matter, but when it's software, it might be the problem, right? And so, you know, this just really records everything. Or at least, I say records: it records sufficiently for

Starting point is 00:36:33 you to be able to reconstruct everything, so we only record the non-deterministic inputs. But, where was I going? Yeah, because of that performance overhead, usually you don't have it always on. We actually do have customers who have it always on, if they've got a low-CPU application that is exhibiting, we've got one where the customer sees it crash

Starting point is 00:36:59 just literally a handful of times a year, and it's really, really hard to diagnose. It doesn't use a very high percentage of CPU, so they've quite recently deployed with that enabled all the time: let's wait for it to crash and then we can find out what's happened. But most of the time you just turn it on when it's going wrong. So either, you know, maybe it's a node in the cloud and you know that it sporadically has these problems, so we'll just turn it on in one in a hundred instances or something. Or you've got a customer who's complaining because this isn't working in their environment. Of course, we've all been there: the customer says, well, you know, I do this

Starting point is 00:37:35 and it breaks, and you try and do this and you say, well, it works for me. So now you can say, well, okay, turn on the recording. And how that happens is left as an exercise to our customers. So maybe they have a menu config option inside, or maybe they can just tunnel in and do it remotely, whatever. You get that recording on and say, okay, now do that again, and now send us a recording file back. I'm thinking about the way you've described how this is implemented, that you are recording all of the non-deterministic inputs and doing some form of JITing of the application code. And I'm thinking there's other things that you could be doing with this data

Starting point is 00:38:12 other than debugging your program, like memory profiling or CPU profiling. Are there other applications, or are any of your customers abusing the product in ways you didn't intend for them to? Yeah, absolutely. So it actually opens up some really interesting possibilities, ones that we didn't see in the early days. In fact, I remember back right at the very beginning when I was kind of thinking,

Starting point is 00:38:39 you know, should we do this? And I spoke to a few people, found some of the smartest people that I knew and ran this by them as an idea. Do you think this idea is crazy or do you think it might just work? And there's one guy I worked with years ago, really, really smart guy, and I ran it by him. He said, yeah, that sounds like it would be great. He said, I think there would be lots of other applications beyond just debugging as well

Starting point is 00:39:03 if you can wind time back. And I said, oh, wow, gee, Pete, that's really interesting. Can you sort of elaborate on what that would be? And he said, well, I don't know. I can't quite give you any examples, but I'm pretty sure it's there. Thanks. Thanks, Pete. That really helps.

Starting point is 00:39:17 But actually over the years, and particularly through, you know, interaction with customers, I've realized, oh, I see. I see what he means. And so, yeah, we've had, I mean, I can't get into too many details, unfortunately. But in the last six months or so, I think all four of our top four biggest customers have come to me and said, Greg, we love your stuff. One of them said it's like heroin.

Starting point is 00:39:52 I think he means he just needs it all the time now, rather than that it ruined his life. But they say: this is really, really great for debugging, but do you know what, could we use it to do X? Where X is solve a particular kind of business application problem that they're facing, one that there's no way I could envisage, because I don't understand their business, right? And that's as diverse as EDA companies, running simulators and synthesis tools,

Starting point is 00:40:27 to banks, trading applications, across the board. And they have specific things where just this ability to know what really happened is just fundamentally really powerful. And then to be able to start to do kind of analyses on those recordings and put that together. I think there's a whole heap of really, really interesting directions this can go in. And of course, we come back to what I mentioned just before, that we're trying to grow the business here, and ultimately that comes down to decisions about what can we do that will generate revenue in the short term and also has the highest probability of generating revenue as well. So it kind of limits, you know, I'd love it if we

Starting point is 00:41:18 had a research department with, you know, 100 people in it that would just go off and do all this crazy stuff, and we'd probably find some really, really amazing things in there. But realistically, I think it's going to be a few years before we can actually do that. So you talked for a second there about how one of your customers was comparing it to heroin. Reversible debugging definitely seems like the type of thing where, once you've had it fix some hard-to-find, hard-to-debug problem for you, you would get really hooked on it. Do you think reversible debugging is becoming more of a thing in the industry, like more and more developers are using it, it's becoming more popular? Because I can definitely see that, you know, maybe 10 years from

Starting point is 00:42:00 now it's just going to be the standard, that everyone uses reversible debugging all the time. Yeah, I think so. I mean, that's why we started the business, right? Sure. We could see that this was, if you like, an idea whose time has come. It's just a much smarter way to do things. However, it is, I must say, frustrating how slowly things change and people's habits change. You know, I remember when we launched version one, which, as I said, was more like 0.1, but it was a thing and it worked. And I had a friend who, a few years before, back when I did OS research, had had his new OS posted on Slashdot, and the web server had crashed under all the traffic, right? So I'd pre-warned our web hosting provider: well, we're going to launch this product and

Starting point is 00:43:02 we're going to put a press release out and, you know, probably not, but just be aware, just in case, there might be a bit of a spike in traffic. And they were like, yeah, okay, whatever. They'd probably heard that before. And we launched, and just, like, nothing happened, right? I mean, it was real. You just couldn't get anybody to see it or to care. And slowly, slowly over the years, people one by one by one started using it. Those that use it,

Starting point is 00:43:28 whether that's using our stuff or other implementations of this, it's completely addictive and they couldn't imagine working any other way. It just takes so long for people to change habits. I think still the predominant way people debug is with printf. And I think there's something

Starting point is 00:43:53 about, I think, when you're debugging, your mind is already completely full. You know that quote from, is it Brian Kernighan, I think, the quote that debugging is twice as hard as writing the code in the first place? So if you make the code as smart as you can, how will you ever debug it? And it's kind of related to that in that, you know, when you're debugging, your brain's CPU load is at 100%. And so new things, trying to find new ways, even if they will make you much more efficient,

Starting point is 00:44:30 it's just hard for people to do. So it's frustrating how it is. I mean, it should just be a standard thing where it's available. I know it's not available everywhere, but where it's available, it's a constant source of frustration to me that it's not just,

Starting point is 00:44:44 why isn't absolutely everybody using this stuff? But it will take time and habits will change. And you'll always have those dyed-in-the-wool printf debug people. But, yeah, I mean, I'm not well known for my patience. So I think probably when you say 10 years from now, sadly, you're probably right. It probably will take 10 years. Jason, do you have any other questions you want to ask? No, I don't believe I do. Okay. Well, Greg, where should people go if they want to find more information to get started with UndoDB, or just find out more about you?

Starting point is 00:45:26 Yeah, so, undo-software.com. That's definitely the first place to go, and, you know, everything is there on the products. There's not a lot of information on me. I don't think there is much of a way to find out about me. I think this podcast has probably been as much exposure as I've had in the last few years. But yeah, you don't care about me. You care about UndoDB. Okay. Well, thank you so much for your time today, Greg. Thanks a lot. Thanks for joining us. Thanks so much for listening as we chat about C++. I'd love to hear what you think of the podcast. Please let me know if we're discussing the stuff you're interested in,

Starting point is 00:46:09 or if you have a suggestion for a topic, I'd love to hear that also. You can email all your thoughts to feedback at cppcast.com. I'd also appreciate if you can follow CppCast on Twitter, and like CppCast on Facebook. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode is provided by podcastthemes.com.
