CppCast - UndoDB and Live Recorder
Episode Date: January 8, 2016
Rob and Jason are joined by Dr. Greg Law to discuss reverse debugging with Undo Software.
Dr. Greg Law is co-founder and CEO at Undo Software. He has spent nearly 20 years writing systems-level code, including novel kernel designs and networking architectures in academia and at a variety of start-ups. Greg finds it particularly rewarding to turn innovative software technology into “real” business development. He still gets to write some code, although sadly most of his coding these days is done on aeroplanes. Greg lives in Cambridge, England with his wife and two children.
News
C++ Status at the end of 2015
Starting a tech startup with C++
} // good to go
C++Now 2016 Call for Submission
Dr. Greg Law
@gregthelaw
Greg Law's posts on Undo Software's Blog
Links
Undo Software
Jason's photos from Kenya
Transcript
This episode of CppCast is sponsored by Undo Software.
Debugging C++ is hard, which is why Undo Software's technology
has proven to reduce debugging time by up to two-thirds.
Memory corruptions, resource leaks, race conditions, and logic errors
can now be fixed quickly and easily.
So visit undo-software.com to find out how its next-generation
debugging technology can help you find and fix your bugs
in minutes, not weeks.
Episode 40 of CppCast with guest Dr. Greg Law, recorded January 8th, 2016.
In this episode, we talk about tech startups using C++.
Then we'll talk to Dr. Greg Law from Undo Software.
Greg will tell us about how UndoDB and Live Recorder will change the way you debug.
Welcome to episode 40 of CppCast, the only podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
Doing great, Rob. How about you?
Doing good. I saw on Twitter earlier this week you shared your pictures.
Do you want to mention that real quick?
Yeah, so one of the reasons why the podcast has been semi-irregular over the holidays is that I got back from a Kenyan safari recently, with a bunch of travel over the holidays, and had a great time.
Yeah, the pictures are pretty nice. Do you have a high zoom on your camera, or were you really that close to those animals?
We were easily within 25 feet of lions several times. Some lions walked right past the trucks. But I also had a 30x zoom on my compact camera, so on some of the pictures it's hard even for me to remember whether they were close up or zoomed in a lot. It's a combination of both.
Okay, very cool. Maybe we can even put that Flickr link in the show notes
if people want to check it out.
Sure.
Okay, so at the top of every episode,
I'd like to read a piece of feedback.
This week, I got a tweet from Guido,
who's saying the CppCast episode with Richard Thomson
is by far the best he's listened to so far.
A must-hear for every developer,
not just C++ developers.
That was a while ago, but Richard Thomson was definitely a great guest. That was when we were talking about exercism.io, which is the online, kind of, testing tool. How would you describe it, Jason?
It's like programming training, I guess, might be right. And it's kind of collaborative: you go through it and then you have someone
reviewing your code.
It definitely sounded like a really nice tool.
I haven't yet actually gone back and tried it out.
Neither have I.
Hopefully, the shows keep meeting his expectations since that was a while ago.
Yeah, absolutely.
Well, we'd love to hear your thoughts about the show as well.
You can always reach us on Facebook or Twitter.
Email us at feedback@cppcast.com. And don't forget to leave us reviews on iTunes.
So joining us today is Dr. Greg Law. He is a co-founder and CEO at Undo Software. He spent
nearly 20 years writing systems level code, including novel kernel designs and networking
architecture in academia and at a variety of startups. Greg finds it particularly rewarding to turn innovative software technology
into real business development.
He still gets to write some code, although sadly,
most of his coding these days is done on airplanes.
Greg lives in Cambridge, England with his wife and two children.
Greg, welcome to the show.
Hi, thanks.
So, novel kernel designs.
Is there anything about that that you can...
...decompose the kernel down to very fine granularity and have
different components for, you know, scheduler, memory manager, that kind of
thing, and then have those components actually be protected from each other, so
that if one goes wrong it can't stomp over the other, but yet do that in
a way that didn't have very bad performance overheads.
So we had a cool way to play around with the x86 segmentation registers
to allow domain crossing in a very efficient way.
And that kind of started life as my PhD thesis,
and so I made this OS called Go,
which was short for GregOS,
and then subsequently Google stole the name, of course,
for the language that came later.
But some of those ideas kind of live on inside Xen,
inside the Xen hypervisor,
although unfortunately, when the 64-bit x86 CPUs came along, they kind of nobbled the segmentation stuff. But yeah, it was fun days.
Cool.
Very cool. So we had a couple of news items to talk about before we get into talking more about Undo
Software. The first one is C++ status at the end of 2015. And Greg, feel free to jump in with
any of these. And this is kind of like just an overview of all the C++ news that went on in 2015.
I feel like we've probably touched on this at one point or another over the past 10 months,
but it's definitely a nice refresher. Was there anything specific you wanted to call
out in here, Jason? I don't recall seeing anything that really jumped out at me,
but it is a lot of great information to look forward to what's coming and what has been
in recent C++ news.
Yeah. I mean, just a couple of highlights: Clang and GCC are now fully C++14 compliant, Visual Studio is making a ton of progress this year with C++11 and 14 as well, and there's all the work that's been done towards C++17 with the different technical specifications. And there was all the C++ Core Guidelines work out of CppCon, which is really exciting.
So I thought the increase in C++ popularity on some of the indexes was quite interesting to see as well. With some of these indexes you kind of need to take it with a bit of a pinch of salt, right, because it's just very hard to really have an accurate view on this stuff. But it corroborates the kind of anecdotal evidence that I've seen all over the place as well, where, just a few years ago, people would tell me that C++ is a sort of dying legacy thing, and you certainly don't hear that anymore. I was in one of the very large banks just the other day, and they were saying that they were kind of mandated that all the code would be written in Java from now on. This was a few years back, and they've kind of revoked that requirement now, and they're seeing more and more C++.
So, yeah, it's good to see it coming back.
Absolutely.
That was from the TIOBE Index, where C++ is currently listed as number three, just below C, with Java holding the number one spot.
So it looks like C++ is kind of inching up more and more against C, which makes sense to me.
Makes sense to me, too.
And if you combine C and C++ together, then I think it gets the number one spot quite comfortably, right?
Yeah. So Java's at 21.4, C is at 16, and C++ is at almost seven. So yeah, together they easily trump Java.
Cool. Kind of moving off of that, there was an interesting article on Medium about starting a tech startup with C++. And it was just an interesting read where this guy,
I think he's the CTO for some new company,
and they're working on a web service,
and he decided it makes sense to write it in C++.
And he had a lot of colleagues in the industry tell him,
oh, that doesn't make any sense.
You should be writing it in Ruby or Python, blah, blah, blah.
And he just kind of went over his reasons why he thinks C++ makes sense.
And it absolutely does. I mean, look at Facebook, with the amount of server time that they save by using C++ instead of one of those dynamic languages.
Yeah. And one of the notes at the bottom here is that this guy points out that his C++ example code is 40 times faster than the Python equivalent.
Yeah, that's huge.
It's huge, and it's a huge cost savings for his new company, right?
Yeah. I guess the thing is, and this is kind of to put my CEO hat on rather than the dev hat, when you're starting a company, the economics of the efficiencies kind of come out differently.
So server time, if you're starting up, is not very expensive,
because you don't have that many users to begin with, right?
And I think particularly with web services type stuff,
it was impressive to see that 40x improvement.
I read that article.
And it totally explains why the likes of Google do that,
because they have millions of servers,
and it doesn't need to be anything like 40x. If you can improve by 10% the number of transactions per server you can handle, and you scale that across millions of servers, across data centers across the world, then it makes sense to pay people the money to provide that efficient code.
If you're able to just run on one or two EC2 instances, and based on the premise that you can write the same code more quickly in something like Python or Ruby, then it kind of makes sense to just do that, because compute time is cheap and programmer time is very expensive.
But I guess he's planning for success,
and, you know, hopefully he'll have lots of users,
and he'll get a return on that investment nice and quickly.
And you kind of led into something that we haven't talked about yet on this show,
I believe, some of these Facebook libraries that he mentions.
Folly, yeah.
Yeah, Folly, and some of the higher-level HTTP servers and stuff for C++
that can make these things easier and more succinct in C++ than they typically are thought to be.
Right.
So this next article is a little sad.
It comes from Scott Meyers' personal blog.
And the headline is "} // good to go". And it's just talking about how he has made the decision to effectively retire
from the world of active involvement in C++.
I'm really glad we were able to have Scott on the show when we did.
And I'm at the same time really excited to see what he decides to do next with his time.
Jason, was there anything you wanted to add here?
No. I'm also interested to see if he still comments from time to time on the direction of C++, or whatever else he goes on to do. And it's just interesting to note, I think he said this is his 25th anniversary of involvement in C++?
Yeah.
Wow.
25 years of guiding all of us in how to be better C++ developers, basically. So hopefully Scott's going to still listen to the show. I know when he came on he told us he was a listener. So we wish you the best, Scott, and look forward to seeing what you do next.
And then this last one I was kind of...
Sorry, I just wanted to jump in, if that's okay. I was also sad to see that.
I remember reading his books, you know, way back.
Gosh, well, maybe not quite getting on for 25 years ago, but 20 years ago, certainly.
And a lot of stuff was explained very clearly, and it was very interesting.
But I wonder whether it's just a general sort of comment on C++.
As you said in the intro, I don't get to spend nearly as much time on the tech as I would like to these days.
C++ is so fast-moving and so big that there is an issue, I think: if you're not able to dedicate all of your professional life to keeping up to date and really understanding it all, then it's kind of hard, right? So I don't know what the answer is here. I understand why it's as big as it is, and all that stuff is certainly very powerful. But for people like me, who spend less than 50% of their time developing,
it makes it hard.
Well, Scott kind of started to try to address that
in one of his most recent articles,
suggesting that it's time to start deprecating
some of the more dangerous and least often used features of C++
and to try to actually simplify the language down to what it needs to be.
See if that goes anywhere.
Yeah.
No, although it still has the issue of the fast movement, you know, like the new version coming out every three years, and a lot to sort of keep on top of. It sort of strikes me that C++ is less a programming language, more a way of life.
Great.
So this last thing I wanted to bring up
is that the C++Now conference
is putting out its 2016 call for submissions.
And there's actually a really short deadline on this one.
The submissions are due January 29th,
which is only three weeks from now.
And then proposal decisions will be sent out February 22nd.
Jason, are you planning on trying to go for this one again?
Yeah, I'm going to make a submission again.
I didn't mention on the show here that my submission to ACCU was denied, so I will not be going to that conference.
But yeah, I'm going to see if I can go to Aspen again.
It's a great conference and a wonderful venue.
Now, Greg, I know...
I will be at ACCU. I did it first at CppCon, but it's a talk on GDB and tips and tricks with that.
Because, having sort of slightly complained that C++ has got rather big, GDB has got pretty big as well over the years.
And that was a great talk, by the way.
Oh, thanks.
I'm sorry.
Go ahead.
I didn't mean to interrupt.
No, I just said thanks.
It was just nice to have the compliment, so thank you.
Do you have any plans to go to C++Now or CppCon again, Greg?
CppCon again, I hope to if I can.
C++Now, I don't think I'm going to get there this year.
It's just a little bit too niche.
My co-founder, Julian, went last year.
And I think that definitely comes into the, you know, if it's your way of life, then it makes a lot of sense. But if it's just something that's related to something you do, it's a little bit too intensive, I think.
That's probably fair.
Yeah, Jason, I think you're going to have to go by yourself again
to C++Now to represent this show.
I don't think I'm going to be able to make it this year,
but hopefully CppCon I'll get to.
Very good.
Okay, so Greg, let's start talking about Undo Software and UndoDB.
Could you give us an overview of UndoDB to get us started?
Yeah.
So, at least in its original incarnation, it looks just like GDB.
Actually, let me step back on that a bit.
The fundamental technology is a way to record the execution of a program
so that you can go back and see what it really did.
But we didn't want to replace developers' existing environments,
so we kind of made it debugger-agnostic,
and you get different debuggers that act as the front end.
GDB is the one that you would use by default.
Most of our direct customers who come to us, that's what they use, because that's what they know.
And then you use GDB within whatever environment you would usually use it.
Some people just use it raw at the command line,
but some people will use it within Eclipse or from within Emacs,
or however they like.
But we have other debugger front ends as well.
So there's ARM DS-5, which stands for Development Studio 5.
That's the development studio environment from ARM,
and that has a feature that their marketing guys
call Application Rewind,
but that's us underneath, right?
That's DS-5 interfacing down onto us.
And likewise, you get the TotalView debugger
from Rogue Wave, which is aimed at HPC kind of stuff.
That has a feature that their marketing guys
call Replay Engine,
and again, that's us underneath.
And we'll have some others coming out this year.
And really what it's about is giving these debuggers
the ability to step and run the program backwards
as well as forwards.
So we all know that increasingly,
well, I don't know if it's increasingly,
but I think always, developers
spend a lot of time finding and fixing bugs. There's a quote I like to use from a guy called
Maurice Wilkes, who arguably is the first person to be a professional programmer on a
general-purpose computer. There are various claims about what was the first computer, and different universities claim it, but the one at Cambridge was the first one to be used to solve real problems and actually do real work, so run programs other than just proving that this computer thing worked. He has a quote along the lines of: he remembers with great clarity the moment when the
realization came over him that a good part of the rest of his life was going to be spent finding
errors in his own programs. And that quote kind of makes me laugh, because I remember
the same. When I was, I don't know, 15 or so, and just playing around with programming, it
was like, yeah, this is kind of hard to get right.
And I think whatever we do with languages
and with smarter ways to write code,
it just encourages us to be smarter still
and push the boundaries.
And so you're just always going to spend a lot of your time
trying to make it work.
And everyone knows that it's difficult, because you're trying to
think, how did that happen?
Why does this variable contain
the value that it does? It should be impossible.
How can that be?
If you're lucky, the bug itself is
very close to where it goes wrong. So, you know,
you assign null to
a pointer and immediately dereference it,
and you get a core dump there and then.
But all too often,
the problem itself is some way
back in history.
So we could see very clearly
that if you could just step backwards,
or if you could do things
like put a watchpoint,
or data breakpoint, on that bad value
and run back to the line of code
where that was changed,
then that would be a very powerful way
to get to the bottom of some of these very difficult issues.
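To make the "run back to where that was changed" idea concrete, here is a minimal Python sketch. It is a toy, not how UndoDB or GDB work internally: the `WatchedValue` class and the writer names (`init`, `load_config`, `reset_stats`) are hypothetical, and the write history is kept explicitly by the program rather than recovered by a debugger. It only illustrates the question a reverse watchpoint answers: which write last changed this bad value?

```python
from dataclasses import dataclass, field


@dataclass
class WatchedValue:
    """A value that remembers every write made to it (toy stand-in for a watchpoint)."""
    value: int
    history: list = field(default_factory=list)  # entries are (step, writer, new_value)

    def write(self, step: int, writer: str, new_value: int) -> None:
        self.history.append((step, writer, new_value))
        self.value = new_value

    def last_write(self):
        """Toy analogue of running backwards to the most recent write."""
        return self.history[-1] if self.history else None


def run_program(divisor: WatchedValue) -> str:
    divisor.write(1, "init", 4)
    divisor.write(2, "load_config", 2)
    divisor.write(3, "reset_stats", 0)  # the real bug: clears the wrong field
    # ... imagine thousands of unrelated steps here ...
    failure_step = 1000
    try:
        return str(100 // divisor.value)
    except ZeroDivisionError:
        # The crash site says nothing about the cause; the write history does.
        step, writer, _ = divisor.last_write()
        return (f"failed at step {failure_step}; "
                f"divisor last written by {writer} at step {step}")


if __name__ == "__main__":
    print(run_program(WatchedValue(1)))
```

In a real debugging session the program does not keep this history itself. With GDB's reverse debugging commands, setting a watchpoint on the variable and then issuing `reverse-continue` stops at the write that produced the bad value.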
So, yeah, so that was sort of UndoDB and the genesis of that.
My co-founder, a guy called Julian Smith, and I
started over 10 years ago now
and released the first version in 2006. It was
very clunky, very slow. We called it 1.0, but really it was 0.1. We've
developed it over the years. For quite a long time, actually, we just did it in
our spare time, trying to get enough income coming in so that we could put food on the
table. It takes a long time when you're working evenings and weekends on something. It just took
many, many years. But eventually, three or four years ago now, we had enough that
I could quit my day job and go full-time. A year later Jules did the same. A couple of years ago
we were still five people in a big shed in my back garden.
And we've sort of gone from there.
We've taken a few investment rounds.
We're still kind of small, right?
We're 18 people.
But, yeah, we're having a lot of fun.
Awesome.
So can we go into a little bit more about what platforms are supported by UndoDB?
This is Linux and Android primarily?
Right, exactly. So it's just very tightly tied in the implementation to the Linux kernel.
It's very much at the kernel level.
So we don't have any dependencies on glibc or anything in user space. So officially,
you would call it GNU/Linux, right?
Certainly Richard Stallman's very keen
that it's called GNU/Linux,
because all of the user space you see is GNU.
But actually for us,
it really is Linux.
It really is the kernel that we care about.
So it just translates very easily over to Android,
which is the Linux kernel
but a different user space.
So do you support anything like Raspberry Pi or any other small Linux platforms,
or do you care?
Yeah, no, we do care.
The implementation is very tied to the kernel and also to the CPU architecture.
So we do a JIT binary retranslation of the code as it runs
in order to be able to do what we do,
in order to be able to pull out all of those non-deterministic inputs.
We know all about the x86 instruction set, and we have a full decoder for that,
and we do JIT transformation on that, and we do the same on ARM.
But beyond that, we don't care.
So we care very much what the OS
kernel/user-space ABI is,
we care very much what the instruction set architecture is,
but beyond that,
whether it's running on a Raspberry Pi
or a node in a big Cray supercomputer somewhere,
it's all the same to us.
I wanted to interrupt this discussion for just a moment to bring you a word from our sponsors.
Do you hate it when your customer finds a bug in your code?
Tired of spending ages trying to reproduce intermittent failures? At Undo Software,
they know that debugging C++ can be hard. That is why their next generation
debugging technology for Linux and Android is designed for C++ users
and is proven to reduce your debugging time by up to two-thirds.
Embed Live Recorder within your program so that you have a complete record of your program's execution in production.
Activate it in your test suite to solve intermittently failing tests and replay it later for offline analysis.
Memory corruptions, resource leaks, race conditions, and hard-to-find bugs can now be solved quickly and easily.
Visit undo-software.com for more information and start fixing bugs in minutes, not weeks.
So, several tools that I've learned about recently for debugging and performance monitoring
rely on being able to access the CPU hardware counters and therefore cannot
run on VirtualBox.
Does UndoDB have any limitations like that?
No, no. So we're entirely a software implementation, which has pros and cons, right? The main con being it's just harder. It's more work.
But the main advantage being it gives you more flexibility and you're less tied to these things.
So it works just fine in VirtualBox or on an AWS instance
or something like that.
We care only that it's a Linux 2.6 kernel or later,
which these days is just everything, and, I think, an ARMv5 architecture or later, or on x86 I think it needs to be a Pentium or newer. So it's pretty generic.
We knew from the beginning when we did this
that if people are going to be able to really use this,
it really needs just to work, as far as possible anyway,
out of the box.
Because when customers come to us for the first time,
they're usually in crisis mode, right?
When things are going well and everyone's happy, it's kind of hard to get them to pay any attention to you. But when things really aren't going well, when they've tried everything else, then they'll talk to us. And at that point, if it's, oh, but there's a load of caveats, and you
need to change your code, or you need to do whatever it is, install a kernel module,
make sure you're doing this, then
they're already in a crisis and they don't really have the time to do that.
And it does allow us to do
some neat stuff as well. When you get into the details
of it, there are some kind of nasty things.
Like, for example,
some of the instructions are non-deterministic.
So with most instructions,
if you add two numbers together
and you have the same register state
and the same memory state when you do that,
then you'd better get the same answer
each time you do that.
But other instructions,
read the timestamp counter is one, but there are some others, will give you a different result even when you do
them in the same state. So being able to translate those in the way that we need to
gives us some power and, critically, I think probably most importantly, allows us to deal with
applications that have shared memory, right? Memory shared with another process, or memory shared
with a device. I mentioned that we run on some of the Cray and other supercomputers, and
one of the very common models there is they use RDMA, remote direct memory access. And what that means is one node on the network will be able to write directly
into the address space of another node.
And you can imagine for a tool like ours, that can be problematic.
But because we do that binary retranslation,
we can actually intercept those changes and handle that case correctly.
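The split between deterministic and non-deterministic operations described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Undo's implementation: `time.perf_counter_ns()` stands in for a timestamp-counter read, and the `Tape` class is a hypothetical name for a log that captures only the non-deterministic results so that a second run can be fed the same values.

```python
import time


def add(a: int, b: int) -> int:
    """Deterministic: the same inputs always give the same result."""
    return a + b


def read_counter() -> int:
    """Non-deterministic stand-in for reading a timestamp counter."""
    return time.perf_counter_ns()


class Tape:
    """Records results of non-deterministic calls, then replays them in order."""

    def __init__(self):
        self.log = []
        self.replaying = False
        self.pos = 0

    def call(self, fn):
        if self.replaying:
            # Replay: return the recorded value instead of calling fn again.
            result = self.log[self.pos]
            self.pos += 1
            return result
        result = fn()
        self.log.append(result)
        return result

    def rewind(self):
        """Switch to replay mode, starting from the first recorded value."""
        self.replaying = True
        self.pos = 0


def compute(tape: Tape) -> int:
    t0 = tape.call(read_counter)  # must be recorded
    total = add(2, 3)             # needs no recording: it recomputes identically
    t1 = tape.call(read_counter)  # must be recorded
    return total + (t1 - t0)
```

Running `compute` once with a fresh `Tape`, calling `rewind()`, and running it again yields the same result both times, because only the two counter reads had to be captured; everything else is re-executed.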
So, maybe this is getting
off in the weeds a little bit, but in that particular case do you need to be running on
all of the nodes that are interacting, or would just an individual node work?
Yeah, no, just an
individual node will work. Happily, there is enough communication with the OS
beforehand to set that up, because obviously if you just allowed any process to
write arbitrarily into another process's address space across the network, then
that would be kind of anarchy. So there is a kind of handshake where they say, right, well, this
is the memory region that I'm going to use to communicate with, and then there's a kind
of authentication thing that allows that to happen. So we can see that happen, and then we can know, all right, this is a memory region that we need to treat specially.
Okay.
So to bring it back a bit, you talked a bit about how UndoDB kind of just works.
What is the process like to switch over from GDB to UndoDB for a first-time user?
Simply, rather than using
GDB, use UndoDB.
It's that simple?
It's that simple. And actually, UndoDB itself is a Python wrapper that, like most Python programs, has sort of grown over the years from a hundred-odd lines to, I don't know how many, tens of thousands now.
And then, yeah, that will invoke GDB in the right way, so that actually you're using UndoDB underneath.
So we make it completely transparent. Or if you're using, say, Eclipse, if you look in the right place there's a dialog box in Eclipse that says
what the path to GDB is, and it's expecting just a path to
GDB. But if you replace that with the UndoDB command, then Eclipse will use UndoDB and be
none the wiser. And of course, it is actually using GDB in that case,
right? It's just with us as a kind of backend. So we fit between the debugger, be that GDB, DS-5, TotalView, TRACE32 now, whatever.
We fit between the debugger and the program that's being debugged.
Okay.
So in UndoDB, you can step forward while debugging or step backward.
You can run backward with the program.
Is that correct?
That's right.
Yeah.
So, you've mentioned Eclipse several times. What IDEs have the
integration to support both the go-forward and go-backward kind of functionality?
Most of them do already, because actually GDB has had this capability for some time.
It's just really, really slow.
So the vast majority of the front ends know how to take advantage of that.
And if they don't, then as long as you can at least type at the command line, you can just type reverse-next or whatever, rather than hitting the
button. But on most of them you can usually find the buttons. We've
actually got patches for a couple of the lesser-used ones as well, just because
customers like them or we like them. So there's KDbg, which is the KDE graphical debugger, and
we patch that, for example, to have those reverse buttons as well.
Okay.
But you can use standard Eclipse, Emacs...
What else? I think they're the main front ends that customers
use, actually, Eclipse and Emacs. They've had the backwards buttons
there for some time.
Okay. You mentioned how the project is very closely tied to the
Linux kernel. Are there any plans to support anything outside of Linux? Have you looked at that at all?
Yeah, plan is a strong word.
But yeah, I mean, I think it's kind of interesting, because
the two things it depends on, of course...
well, it's all about the return on investment, right?
So what's the customer demand for it, and how hard would it be for us to do? And as a young company,
trying to get ourselves established and get our place in the world,
it's very frustrating, but you're forced to be quite short-termist, right? And I don't know, you
can argue about whether that's a good thing or not, because actually if you have these long-term goals you often find that
at least some of your assumptions were false, and then it's too big an investment to get
there to learn. But anyway, for the rights or wrongs of it, the truth is we have to
be doing things that will return meaningful revenue to us in the next one, two, three
quarters. So it makes it very difficult to embark on some of these larger
things, like porting across to another platform. So for example, when we ported across to ARM,
originally we were x86 only. When we ported it across to the ARM architecture, we did that very closely with the company, and they actually funded that effort. And we did that
because we knew we were going to do the integration into DS-5, so it's all kind of part of that. It
would have been basically impossible for us just to decide
we're going to go off and do an ARM version now and then see if that sells.
Okay.
Should we talk a bit about your other product, called Live Recorder?
Does that kind of take UndoDB and go a little further with it?
Right, right.
So it takes kind of the core recording technology,
but rather than being an interactive tool,
it is a library that you can link against, and it's a very straightforward C API to do things like enable recording, stop the recording,
save the recording to a file, and some sort of configuration stuff as well. So then,
if you link against this library, it's dormant by default,
but if you call in to turn on the recording or whatever,
then that means the program can record itself, effectively.
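The shape of such a library (dormant by default, with calls to start recording, stop, and save to a file) can be sketched as follows. This is emphatically not the real Live Recorder API, which is a C library whose function names are not given in this conversation. `ToyRecorder`, its methods, and `handle_requests` are all hypothetical names, written in Python only to show the control flow of a program that decides for itself when to record.

```python
import json


class ToyRecorder:
    """Hypothetical in-process recorder. Not the real Live Recorder API."""

    def __init__(self):
        self.active = False  # dormant by default
        self.events = []

    def start(self):
        self.active = True

    def stop(self):
        self.active = False

    def note(self, name, value):
        """Log one input while recording is on; always pass the value through."""
        if self.active:
            self.events.append({"name": name, "value": value})
        return value

    def save(self, path):
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.events, f)


def handle_requests(recorder, requests, out_path):
    """Start recording when something looks wrong, then save what was captured."""
    for req in requests:
        if req < 0 and not recorder.active:  # the program's own trigger condition
            recorder.start()
        recorder.note("request", req)
    if recorder.active:
        recorder.stop()
        recorder.save(out_path)
        return True
    return False
```

The design point is that the trigger lives in the application: the recorder costs nothing until the program itself decides a situation is worth capturing.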
And so you can then just deploy that program wherever
and particularly useful for independent software vendors
who are shipping their code
where they don't have any kind of control or visibility into it.
So it might be a classic software vendor
sending their code to somebody else,
or it might be somebody who's kind of deploying code up into the cloud,
but often when your code's running up in the cloud,
it's quite difficult to get that visibility.
And so when weirdness is afoot,
the program can, under whatever circumstances it decides,
start the recording
and then save that
recording to a file. And then you can
take that recording file and load it up
somewhere else, another time,
another place, and you load
it up into UndoDB.
So now you get all of that really cool
stuff of reversible debugging
and that ability to go back and forth, wind the tape back and forth to see what really happened.
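As a rough illustration of the pattern described here (link against a library that is dormant by default, switch recording on when something looks wrong, then save a file to load elsewhere), a minimal C++ sketch might look like the following. Every name in it (`StubRecorder`, `start`, `stop`, `save`, `run_guarded`) is hypothetical and invented for this example. It is not the LiveRecorder API, and the stub only tracks state rather than recording anything.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a recording library. It is dormant by default
// and does no real recording; it only remembers whether it was switched on.
struct StubRecorder {
    bool active = false;
    std::string last_saved;  // path of the most recently "saved" recording

    void start() { active = true; }
    void stop() { active = false; }

    // There is nothing to save unless recording is on; report that as false.
    bool save(const std::string& path) {
        if (!active) return false;
        last_saved = path;
        return true;
    }
};

// Application-side policy (also hypothetical): start recording when
// something looks suspicious, and save the recording if the operation
// then fails. Returns true if a recording was saved.
bool run_guarded(StubRecorder& rec, bool looks_suspicious,
                 bool operation_failed, const std::string& path) {
    if (looks_suspicious) rec.start();
    bool saved = false;
    if (operation_failed) saved = rec.save(path);
    rec.stop();
    assert(!rec.active);  // the recorder is always dormant again on exit
    return saved;
}
```

In this sketch the application, not the library, decides when to record, which mirrors the "under whatever circumstances it decides" point in the conversation.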
In addition, you kind of solve that reproducibility problem.
So you're sort of debugging offline an exact, precise copy of the failure as it occurred in production. And so that's
what we call the LiveRecorder product. The idea is you're recording those failures
as they really happen.
So what kind of performance overhead does LiveRecorder have
when you turn it on?
So it does have a significant, well, certainly a very measurable performance overhead.
So things generally run, I mean, it does depend an awful lot on the program.
Sometimes it's not really noticeable.
Sometimes it'll run about kind of half speed.
And, you know, sometimes it can be a little bit worse than that.
So most of our customers don't turn on LiveRecorder all of the time. So actually,
in sort of the early days, its internal engineering name for some time was
Flight Recorder. We decided not to call it Flight Recorder for a couple of reasons. One, well, actually
there's some Java stuff called Flight Recorder and we didn't want to get confused with that.
But also it's kind of misleading, because flight recorders are
always on and only record sort of certain interesting things, right? You know,
the flight recorder is not going to tell you what the passenger in seat 43C had for lunch,
which when a plane crashes probably doesn't matter, but when it's software, it might
be the problem, right? And so this
just really records everything. Or at least, I say records, it records sufficiently for
you to be able to reconstruct everything. So we only record the non-deterministic inputs.
But, where was I going? Yeah. Because of that performance overhead,
usually you don't have it always on.
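The idea of recording only the non-deterministic inputs can be sketched in a few lines of C++. This is a toy model under stated assumptions, not how UndoDB or LiveRecorder is implemented: the `InputLog` class and the `simulate` function are invented for illustration, and a single integer source stands in for every real source of non-determinism a program might have.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical wrapper around a non-deterministic source. In recording mode
// every value it hands out is logged; in replay mode the values come from
// the log instead of the live source.
class InputLog {
public:
    static InputLog recording(std::function<int()> live) {
        InputLog result;
        result.live_ = std::move(live);
        return result;
    }

    static InputLog replaying(std::vector<int> recorded) {
        InputLog result;
        result.log_ = std::move(recorded);
        result.replaying_ = true;
        return result;
    }

    int next() {
        if (replaying_) {
            assert(pos_ < log_.size() && "replay ran past the end of the log");
            return log_[pos_++];
        }
        const int value = live_();
        log_.push_back(value);
        return value;
    }

    const std::vector<int>& log() const { return log_; }

private:
    InputLog() = default;

    std::function<int()> live_;
    std::vector<int> log_;
    std::size_t pos_ = 0;
    bool replaying_ = false;
};

// Deterministic logic: the only non-determinism enters through in.next(),
// so replaying the same log reproduces the same final state.
unsigned simulate(InputLog& in, int steps) {
    unsigned state = 0;
    for (int i = 0; i < steps; ++i) {
        state = state * 31u + static_cast<unsigned>(in.next());
    }
    return state;
}
```

Running `simulate` once against a live source and then again against `InputLog::replaying(first.log())` gives the same result, even though none of the intermediate states were stored. That is the sense in which logging only the inputs is enough to reconstruct everything.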
We do actually have customers who put it always on
because they've got a low CPU overhead application.
We've got one where the customer sees it crash
literally just a handful of times a year,
and it's just really, really hard
to diagnose. It's not a very high-percentage CPU application, so they've quite recently
deployed with that enabled all the time: let's wait for it to crash, and then we can find out
what's happened. But most of the time, you turn it on when it's going wrong. So either,
you know, maybe it's a node in the cloud, and you know there are sporadically these problems, so we'll just turn on one in a
hundred instances or something. Or you've got a customer who's complaining because this isn't
working in their environment. Of course, we've all been there. The customer says, well, you know, I do this
and it breaks, and you try and do this and you say, well, it works for me. So now you can say, well,
okay, turn on the recording. And how that happens is left as an exercise to our customers. So maybe they have a menu config option inside, or maybe they just can tunnel in and do it remotely, whatever.
And to get that recording on and say, okay, now do that again.
And now send us a recording file back.
I'm thinking about the way you've described how this is implemented, that you are recording all of the non-deterministic inputs
and doing some form of jitting of the application code.
And I'm thinking there's other things
that you could be doing with this data
other than debugging your program,
like memory profiling or CPU profiling.
Are there other applications
or are any of your customers abusing the product
in ways you didn't intend for them to?
Yeah, absolutely.
So it actually opens up some really interesting possibilities, ones that we didn't see
in the early days. In fact, I remember back right at the very beginning, when I was kind of thinking,
you know, should we do this? And I spoke to a few people, found some of the smartest people
that I knew,
and ran this by them as an idea.
Do you think this idea is crazy or do you think it might just work?
And there's one guy I worked with years ago, really, really smart guy,
and I ran it by him.
He said, yeah, that sounds like it would be great.
He said, I think there would be lots of other applications beyond just debugging as well
if you can wind time back.
And I said, oh, wow, gee, Pete, that's really interesting.
Can you sort of elaborate on what that would be?
And he said, well, I don't know.
I can't quite give you any examples, but I'm pretty sure it's there.
Thanks.
Thanks, Pete.
That really helps.
But actually over the years, and particularly through, you know, interaction with
customers, I've realized, oh, I see.
I see what he means.
And so, yeah, we've had, I mean, I can't sort of get into too many details, unfortunately.
But in the last sort of six months, I think all four of our top four biggest customers have come to me
and said, Greg,
we love your stuff.
One of them said it's like heroin.
I think he means
he just needs it all the time now, rather than it ruined
his life.
But this is really, really great
for debugging, but do you know what?
Could we use it to do X, where X is solve a particular kind of business application problem that they're facing,
one that there's no way that I could envisage, because I don't understand their business, right? And that's as diverse as from EDA companies, running simulators and synthesis tools,
to banks, trading applications, across the board.
And they have specific things where just this ability
to know what really happened is just fundamentally really powerful.
And then to be able to start to do kind of analyses on those recordings and put that together.
I think there's a whole heap of really, really interesting directions this can go in.
And of course, we come back to what I mentioned just before: we're trying to grow the business here, and ultimately that comes down to decisions
about what can we do that will generate revenue in the short term, and also has
the highest probability of generating revenue as well. So it kind of limits you. I'd love it if we
had a research department with, you know, 100 people in it that would just go off and do
all this crazy stuff,
and we'd probably find some really, really amazing things in there. But realistically, I think it's
going to be a few years before we can actually do that.
So you talked for a second there about
how your one customer was comparing it to heroin. It definitely seems like the type of thing,
reversible debugging, where once you've had it fix some hard-to-find, hard-to-debug problem for you once, you would get really hooked on it. Do you think reversible
debugging is kind of becoming more of a thing in the industry? Like, more and more developers are
using it, it's becoming more popular? Because I definitely see that, you know, maybe 10 years from
now it's just going to be like the standard, that everyone uses reversible debugging all the time.
Yeah, I think so. I mean, that's why we started the
business, right? Sure. We could see that this was, if you like, an idea whose time has come.
It's just a much smarter way to do things. However, it is, I must say, frustrating how
slowly things change and people's habits change. You know, I remember when
we launched version one, which as I said was more like sort of 0.1, but it was a thing and it worked.
I had a friend who a few years ago, back when I did OS research, had had his new OS
posted on Slashdot, and the web server had crashed under all the traffic, right? So
I'd pre-warned our web hosting providers: well, we're going to launch this product and
we're going to put a press release out and, you know, probably not, but just be aware, just in case, there might be a bit of a spike in traffic.
And they were like, yeah, okay, whatever.
I've probably heard that before.
And we launched, and just like nothing happened, right?
I mean, it was real.
You just couldn't get anybody to see it or to care.
And slowly, slowly over the years, people one by one by one started using it.
Those that use it,
whether that's using our stuff or other
implementations of this,
it's completely
addictive and they couldn't
imagine working any other way.
It just takes so long for people
to change habits. I think still
the predominant way people debug is with printf. And I think there's something
about when you're debugging: your mind is already completely full.
You know that quote from, is it Brian Kernighan, I think,
the quote that debugging is twice as hard as writing the code in the first place.
So if you make the code as smart as you can, how will you ever debug it?
And it's kind of related to that in that, you know,
so when you're debugging, you're just like your brain CPU load is at 100%.
And so new things, trying to find new ways,
even if they will make you much more efficient,
it's just hard for people to do.
So it's frustrating how it is.
I mean, it should be just like a standard thing
where it's available.
I know it's not available everywhere,
but where it's available,
it's like it's a constant source of frustration to me
that it's not just,
why isn't absolutely everybody using this stuff?
But it will take time and habits will change.
And you'll always have those dyed-in-the-wool printf debug people.
But, yeah, so I think, I mean, I'm not well-known for my patience.
So I think probably when you say 10 years from now,
sadly, you're probably right. It probably will take 10 years.
Jason, do you have any other questions you want to ask?
No, I don't believe I do.
Okay. Well, Greg,
where should people go if they want to find more information to get started with UndoDB, or just find out more about you?
Yeah, so undo-software.com,
and that's definitely the first place to go.
Everything is there on the
products, and that's probably the best. You know,
there's not a lot of information on me. I don't think there is much of a way to find out about
me. I think this podcast has probably been as much exposure as I've had in the last few years. But yeah, you don't care
about me. You care about UndoDB.
Okay. Well, thank you so much for your time today, Greg.
Thanks a lot. Thanks for joining us.
Thanks so much for listening as we chat about C++. I'd love to hear what you think of the podcast.
Please let me know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic, I'd love to hear that also.
You can email all your thoughts to feedback at cppcast.com.
I'd also appreciate if you can follow CppCast on Twitter,
and like CppCast on Facebook.
And of course, you can find all that info and the show notes
on the podcast website at cppcast.com.
Theme music for this episode is provided by podcastthemes.com.