CppCast - Regression Testing with Touca

Episode Date: June 25, 2021

Rob and Jason are joined by Pejman Ghorbanzade. They first talk about changes to the {fmt} library to enable better compile-time errors, and some conference news from CppCon, C++ on Sea and NDC TechTown. Then they talk to Pejman about Touca, a new tool he's created for continuous regression testing.

News:
- Palanteer
- A quest for safe text formatting API
- C++ On Sea starting soon
- CppCon call for submissions
- NDC TechTown call for papers

Links:
- Touca
- Touca's vision for the future of regression testing
- Touca SDK on GitHub

Sponsors:
- Incredibuild

Transcript
Starting point is 00:00:00 Episode 305 of CppCast with guest Pejman Ghorbanzade, recorded June 23rd, 2021. Today's episode is sponsored by Incredibuild. If you're like me and don't like waiting for long C++ builds, tests, and analysis results, then you should check out my friends at Incredibuild, because expert developers don't wait. In this episode, we talk about changes to the {fmt} library and some conference news. Then we talk to Pejman from Touca about his new tool for continuous regression testing. Welcome to CppCast, the first podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today?
Starting point is 00:01:23 I'm all right, Rob. How are you doing? Doing okay. I did see we had a suggestion about, I think we were talking about possibly changing our show intro the other day, and I think someone suggested we refer to it as the contiguous podcast for C++ developers. There you go. Because we've been pretty interrupted, at least for the past two, three years.
Starting point is 00:01:43 Yeah, yeah. I mean, well, even if we miss an episode every now and then, we're still managing at least 48 or so per year. I think last year definitely was every week. Yeah, yeah, last year was. Although, I mean, let's just say it looks like life is starting to return for some of us, and travel, and that might cause some interruptions. Yeah, that's certainly worth mentioning. You know, travel and your job as a trainer is starting to come back, you know, very slowly, but it is. Yeah. And, you know, we before have had some guest hosts on the show, so we might have some guest hosts again, depending on your training travel schedule.
Starting point is 00:02:27 So, yeah. Yeah. And we'll talk a little bit about that actually in the news items. Right. Right. Well, before we get to that, at the top of every episode, I'd like to read a piece of feedback. So I got these two comments on YouTube to our ABI episode. One is very, very long, so I'm not going to read the entire comment. But this first one from Tomek says, Correct me if I'm wrong, but in this discussion, the most important argument used against breaking an ABI was that Apple doesn't support library versioning. And then there was another comment which kind of echoed the same thing.
Starting point is 00:03:04 But what they pointed out was that in the Apple ecosystem, Apple changes things on their developers a lot. Like they deprecated OpenGL and introduced Metal. And this person is basically making the argument that, I guess, Apple developers are used to things breaking and having to rewrite stuff. So it really wouldn't be that bad on the Apple developer ecosystem to have an ABI break. That is the part of this that just kind of fascinates me, I guess. And a lot of people echoed similar comments. Yeah, after the episode with Marshall.
Starting point is 00:03:39 Yeah, because one of the projects I've been involved in for the last decade, about once every two years, we have a conversation that goes, so Apple just released a new major version of macOS. And we go, uh-oh, what broke? It's like guaranteed. And it is extremely unlikely that an update to Windows breaks our stuff, or even an update to Linux. We just have to recompile on a new... you know, if we move from Ubuntu 18.04 to 20.04 or whatever, we just have to recompile. Right. Yeah. It certainly struck me as an odd thing to be, you know, installed by the OS and, I don't know, not be versioned. Like you could easily have multiple versions of the STL, I think. I don't know. Yeah. Okay, we've beaten the dead horse at this point, I guess. Yeah, we certainly made our opinions known. Yeah. Well, we'd love to hear
Starting point is 00:04:34 your thoughts about the show you can always reach out to us on facebook twitter or email us at feedback at cppcast.com and don't forget to leave us a review on itunes or subscribe on youtube joining us today is pejman gorbanzad pejman is the founder of tuka helping engineering teams understand the true impact of code changes on the behavior and performance of their software until three months ago pejman worked as a senior software engineer at vital images a canon group company on the vitria software for advanced visualization of medical images before that pejaman worked at vmware carbon black building and maintaining the mac os endpoint for the cb defense product in his free time pejaman enjoys going on walks and bike rides around the gorgeous lakes of minneapolis which he calls home happily during summers and resentfully the rest of the
Starting point is 00:05:19 year pejaman welcome to the show thank you so much for having me. You know, it's funny because I looked up Minneapolis recently, partially because they're a major Delta hub, and Delta is the airline that I fly on so often. Like, would it be fun to move to Minneapolis? And everyone agrees that it is an outstanding city. It's one of the best cities in America, if you can put up with the winners. That's right. Yeah, I agree with that. And I'm sad to say that I don't think
Starting point is 00:05:47 that we're doing a good job putting up with winners here. So we complain all the time, and we might move in the future. But for now, we call it home. Well, newscasters, I find, are like weathermen or whatever, are just terrible everywhere. They're like, oh, like oh well finally we'll
Starting point is 00:06:06 break this cold spell and it'll be a nice warm 90 next week and then we have 90 for like three days and they're like this 90 is getting overbearing it's time for a cold like you people are just you can't make up your minds for what sounds like good weather that That's right. Yeah, I agree. Sorry. Okay, well, Pajman, we got a couple news articles to discuss, then we'll start talking more about Tuca, okay? Sounds great. Okay, so this first one, we have a GitHub tool, and this is Palantir, which is a set of lean and efficient tools to improve the general software quality for C++ and Python programs. So it looks like just with a little bit of
Starting point is 00:06:48 instrumentation, you can hook up, I believe it's a header-only library, right, Jason? I don't know. I didn't get that deep into it. I kind of got mesmerized by the animated GIF here. Yes, yes. It is a header-only library as far as I read. Yeah, it looks like they have some pretty cool instrumentation coming out of it, and it looks like it's pretty easy to set up. Like I said, just include the header,
Starting point is 00:07:16 and then you just have to make a bunch of function calls to put in the instrumentation and then rebuild with it. Looks like something I have to try, though. I know a lot of our listeners will know what we mean if they go and look at the GitHub repository, but it's clearly a well-put-together mGUI interface into visualizing the instrumentation that it collected. And it can do, what, timing data,
Starting point is 00:07:45 memory usage data of some sort. And then it can apparently somehow also can tell you if things are waiting for locks and for how long and whatever. And it almost looks like magic. Like, I have to see, like, how much work does this actually take to put into one of my projects to see what it's doing. I definitely have to give it a try. Yeah. Okay.
Starting point is 00:08:03 And the next thing we have is a blog post from Zverevich, who of course is the author of the format library. And this post is a quest for safe text formatting API. And he's, I guess, been trying for a while to figure out how to get compile errors if you don't put in your format strings properly like if you you know put in a string expecting a number and then you put in
Starting point is 00:08:31 some text that's not a number um but it looks like he finally found it using constival from c++ 20 and i loved how the the code formatter still don't. I was completely random, but the syntax highlighting JavaScript widgets and stuff don't yet recognize constyvel as a keyword, so it doesn't get highlighted. But that's really an aside. I liked this article because, well, more compile-time things are always good, particularly compile-time type checking. And I want to say that it was possible to do this before V8 of the libformat release,
Starting point is 00:09:10 which I think was two days ago. It was possible, but this is certainly more elegant and it seems like the right solution than using macros such as fmt-compile. Well, and it's the default now too before you that's right you have to opt into it before yeah and now if you actually want runtime formatting you have to explicitly say this is a runtime format string not a compile time format string yes it makes a lot of sense to me yeah i'm fully on board with that. If you pay attention to security,
Starting point is 00:09:45 what are those things called? The security alert things, whatever those papers are called. Anyhow, so many still to this day are like, you put in the wrong format string or a runtime determined format string into printf or something like that. The one thing to note here, and he pointed it out as something I guess he would like
Starting point is 00:10:10 to improve, is it's a very long, verbose error message when you don't have the right format string. But it gives you the information you need. It tells you it's an unexpected or invalid type specifier. But it's like a wall of text when you use this wrong, I guess. At this point, we're all used to it, though. Yeah, we are. That's true. Yeah, sad but true.
Starting point is 00:10:37 I want to say that this version 8 release has so much features and improvements that I was very excited to read the changed notes. Anything else worth highlighting? Because we haven't gone through the whole release yet. Well, the list is very long, but the one thing that caught my eye is the compile time formatting of the strings for fmt print statement, which is surprising at first, but it seems like it's actually improving it by 2x,
Starting point is 00:11:12 which is surprising again. Wow. Yeah, impressive. Oh, and he does also mention here that even though it is adding some new code in compile time with const eval, that it wasn't much of a performance hit for build times. That's nice. Yeah, within the margin of error between build time measuring anyhow.
Starting point is 00:11:33 Yeah. All right. So now we have a couple articles of conference news. This first one is C++ on C. And I think the major announcement here is the format in which C++ on C is going to be happening this year. What was the name of the platform
Starting point is 00:11:51 that everyone was using last year, Jason? Oh, Remo. Remo, yes. Yeah, so instead of Remo, it's going to be using Gather Town, which looks pretty neat. It reminds me of old school RPGs where you walk around on the map like like zelda or the game boy games maybe yeah um so that's kind of used that excuse
Starting point is 00:12:14 me not to be con c++ now used that this year used gather town yes oh okay well it looks pretty cool it is that's a nice substitute for the hallway track and that's uh the conference starts let's see when this airs it'll be starting in five days oh it's just next week's okay yeah and so you know if you're interested at all be sure to go ahead and sign up everything's announced and uh check it out yeah very cool and then next one uh is CppCon 2021 have their call for submissions open. The conference is going to be October 24 to 29. And it is going to be both face to face and online. And I just went through the submission process for this. So if anyone's interested, you can choose if you want to do in-person, online, or if you're flexible. Okay. And then lastly, NDC Tech Town also has their call for papers open, which ends July 1st,
Starting point is 00:13:12 which is coming up real soon. And the conference is going to be the 20th to 21st with some workshops from the 18th and 19th. Yeah. And I am definitely giving a workshop at that conference, which is, yeah, that's going to be October, right? So it's literally the exact week before CBPCon. So it'll be a little bit tricky getting back home in time and everything. But at least I don't have to go somewhere completely different between the two conferences.
Starting point is 00:13:42 Right. So I think it's worth it. Oh, go ahead. Oh, sorry. I was going to say, Pejman, do you have any plans to go to any conferences this year? I'd like to. I don't know if I have time to submit a quality talk for any of these conferences this year. But yeah, just having an in-person meeting is something we all craved for last year.
Starting point is 00:14:04 So it's definitely nice. One thing that I have noticed is that this Zoom meeting and, you know, the virtual meeting format has at least been very inclusive. I like that, you know, future conferences would kind of have that in mind, at least make it possible for others who are not necessarily physically able to attend conferences to have a way of joining discussions. Yeah, that was certainly nice about CppCon. I know I met some conference goers from South America who I'm sure would have had a very difficult time getting to CppCon in person. And since you said you didn't know if you'd have time to make a quality submission, I think it's worth pointing out that there's lots of helpful people on the include C++ Discord that if you have a submission that you would like some help reviewing it, there's lots to be very good at that.
Starting point is 00:14:55 Yes. Yes. And people ask me from time to time to review their conference submission. Please do not ask me. I am terrible at it. Every single conference submission that I have personally reviewed has been declined for a conference so you do not want my help okay it's true nope nope that's it although it's worth pointing out that cbpcon is uh so c++ on c virtual only cbpcon hybrid a tech town in person only i believe yeah i do hope the the hybrid
Starting point is 00:15:26 conferences uh do work out though so you know like you were saying pejman a more inclusive audience is able to uh attend i guess yeah all right so uh pejman we we mentioned your article two weeks ago i believe on the show talking about tuka uh Do you want to start by describing it kind of in your own words? Tell us more about it. Yes. And I think you did a marvelous job in explaining how that is. So thank you for that. I guess I start by explaining the product that I have right now and spurred my vision for the company, Tuca. So the product right now is an automated regression testing system. It has two components. One is a remote server that basically serves as the component
Starting point is 00:16:17 for automating all these processes, such as executing tests and comparison of the test results with previous versions, and then reporting them. But there's an open source component separate to that, which is the client libraries, a set of client libraries for users to integrate with their code. And the idea is for users to be, to describe the behavior and performance of a particular version of their code under test and then submit that description to this remote server
Starting point is 00:16:53 that stores the description and then compares that against a prior version and then reports in real time the differences that it finds between the two versions. And that real-time piece is what is the most interesting part, in my opinion, because the entire purpose of this product is to facilitate for software engineers the task of maintaining large mission-critical systems. And one thing that I feel is slowing down the process of maintaining these systems is the long feedback cycle between the time that we make a code change and the time that we understand the true impact of those
Starting point is 00:17:40 changes. So the idea is by giving providing real time feedback to software engineers as they are writing code, we would be able to help them identify, you know, how their code changes are impacting the overall product, and therefore help them catch bugs, essentially. So you have, trying to build my mental model here, I have my normal CI build system that's compiling as I push up changes. And then whenever it runs its test, then it's just also shoving it over to your server at the same time? Is that the normal model for how this would work? That's right. Right now, that's the model. You integrate the execution of the test with your CI, or you might have a dedicated test server. Right now, the running of the test is up to the users, because there are so many environments to support and all that. But the idea is, as soon as you start running the test with all these arbitrary number of test cases, for each test case that you execute,
Starting point is 00:18:53 you can describe the behavior of your software and runtime of your software by capturing the values of different variables, interesting functions, and then pass or submit that data to the remote 2K server. And then you can immediately see that it is comparing it against prior versions and then showing you the differences. So it doesn't even necessarily have to be pushed to your build server. Even during development, when I run the test suite, I can get immediate feedback. Exactly. Yes. So that sounds immediately very helpful.
Starting point is 00:19:33 Yeah. So again, like, so that's the product right now. But my vision for a company and the startup is to just fundamentally change the way that we test software by giving this real-time feedback about the impact. I guess the ideal case, if we could magically solve all the technical problems, is that as you're writing code, we would have tiny squiggles in your editors that would say, hey, you're forgetting this particular test case
Starting point is 00:20:04 and you're mishandling this path you know, path in this coaching that you're making. So we're not there yet. But we're close. Yeah. Well, no, I mean, even what you've described so far, because I was actually just doing this yesterday. One of my one of the branches that I'm working on has a regression that's reported on the dashboard. And so I have the baseline build on my system where I've run the test and I have the current branch checkout on my system where I've built it and I've run the test. And now I'm manually trying to look at that. I mean, we have scripts for it, but it's still a relatively manual process running it locally, because I'm not set up to have all these things were normally locally. And then, well, of course, I'm unable to reproduce it on my local system. So
Starting point is 00:20:55 I don't know if your software still wouldn't help me in this case, I don't think but yeah, it definitely doesn't solve every problem that we face as software engineers. But I personally have had this issue of spending tons of time just setting up the process for doing basic checks about the behavior of my test or the performance of my test. And it feels very clunky. At this day and age, it seems that we need better tools to help us, you know, basically do the decision making other than, you know, just the labor of comparing result files and, you know, comparing the output of our software. Certainly the output to me is just, it doesn't have the necessary information for us to act on it. Right. So what kind of instrumentation has to go into the tests in order to, you know, take some data out of the tests that get run and push it up to Tuca? What does that look like? That's a very good question. And I think it depends on how users want to test, you know,
Starting point is 00:22:01 run the test. There is certainly, it is possible for Tuca to just be a development-only dependency for your product. In this case, it would just be a testing framework, in a sense. So you would use it in your test environment to capture the results, but then when you are shipping your product, there is no dependency on TuOP. That works just fine. We have users that use that path.
Starting point is 00:22:30 The alternative approach is that you want to actually capture the internals of your software as well as its output. In that case, then you can integrate the client library, which is very thin. It's open source as well. So then basically, by integrating that to your product, you can capture these variables. Now, one thing that I'm very proud of
Starting point is 00:22:55 is that those functions that you add to your production code to capture the values of variables or runtime of functions, those will be no op in production. So the client library is designed in a way that it doesn't have any impact on the performance of your software when you've shipped your software. They're just not doing anything. They're skipped over.
Starting point is 00:23:22 Now, however, when you are actually kind of running that code under test from a test environment, they are skipped over. Now, however, when you are actually, you know, kind of running that code under test from a test environment, in which case you have configured a client library, then all those functions would come to life and they capture the results. And later on, you can submit them to the Tuca platform. Does that answer your question?
Starting point is 00:23:44 Yeah, I think so. Okay. One other thing that I want to add is, unlike unit testing or snapshot testing frameworks, I am basically redefining how we write. I wouldn't say that I'm going to replace them, but as a complementary way of testing, we are basically running the workflow
Starting point is 00:24:06 under test and not having any kind of assertions to some expected value. So that makes it a little different than a unit testing library, where you have test cases and you say, for this test case, I want this particular output. In this case, we just want to run the test, you know, grab the information that we want, and then pass it to the platform. So the writing of the test requires a separate executable than your unit test frameworks. However, it's very simple, and doesn't have too much to it, other than just calling your workflow under test. So yeah, what kinds of data can we submit? What does the data format actually look like? So anything that your variables can hold, you can send them to the Tuca server. So another thing that makes this a little different than, say, a snapshot testing framework is that we don't take this data in a string format.
Starting point is 00:25:11 We don't serialize that into JSON and post it to the platform. We serialize it to compare your data in the original type to prevent lossy comparison. And therefore, if you have a floating point value, for instance, you're going to serialize that into a floating point binary representation and then compare it in floating point itself. So that's the technical point. But I think if you are wondering
Starting point is 00:25:49 if I can, say, compare images or videos, technically that's not possible right now. You can compare the binary representation of those videos, but that's not something you want to see. So right now, well, as in my defense, Suka is only three months old. So it is missing a lot of features in terms of you know, in the in the UI for comparison of more complex types of data that we care about. Right. Okay, so like a real world application of the regression test that I was just talking about
Starting point is 00:26:25 having to try to, well, we'd like I said, we have scripts that will find the diffs for us. I just don't know why I have a diff because I can't reproduce it locally. But again, that's not your problem. Yes. If what we what we have is large, like CSV tables, and then our diffing scripts, each they, they're very specific to the type of data they know that they're comparing. So they load up all of these CSV tables, and then do a comparison of the grid, and then say, well, on this line, this column, whatever, this is where you had a floating point diff that was outside of spec. Now, it sounds like because this is just a bunch of floating point numbers, this is something that I could submit to your system,
Starting point is 00:27:07 but I would need to submit each one with a unique ID or something like that? No, so that's, again, up to you. You can submit all of them as an array of floating points or a matrix, right? So that's possible. As another approach, you can actually just submit individual cells of that CSV table. So that's possible too.
Starting point is 00:27:33 By submitting basically a chunk of data as a single entity, then it would be flagged as different if any of those floating points in that chunk is different, right? So that's how you decide if you want to break your data down into smaller pieces or not. Right. So I could see, like, in our particular use case, a vector for each row might be appropriate. And then we would know, okay, this row. Makes a lot of sense, yes.
Starting point is 00:28:05 Interesting. Okay, go ahead. I want to say, that is probably a good segue to how to get started. I was working at this medical software company called Vital Images. We had six million lines of
Starting point is 00:28:21 code for this product I was working on, and my task, certainly, was to basically work as part of a small team to rewrite or replace one of the most critical components of this software that happened to be the lowest level component as well, which is the consumption of data, the ingestion of data. So the product that we were building was a visualization software for medical images such as CT scans and MRIs. Now, these are data sets. Usually data sets are a bunch of files. Every file corresponds to a 2D image.
Starting point is 00:29:05 So our software takes these data sets that can be up to several gigabytes and then creates this 3D constructs and then pass it to higher level algorithms to process it and then render it semantically for the users. Now, the problem we had was we wanted to replace the component that loaded these datasets from files and then create this 3D construct that was consumed by other algorithms of our product. And this particular component was 500,000 lines of code. So we could not replace it overnight. It had to be done gradually. We could not maintain a branch for a year. So therefore, we had to make changes to this live product that would be released to
Starting point is 00:29:54 our customers and then have a way of making sure that those changes do not have negative impact in the safety and accuracy of our product. So that's basically how I was interested in this particular challenge of how do you do that? How do you make sure that you understand how your code changes are impacting the overall product? And we certainly broke the product a lot. Every time it felt that the tooling is one of the issues here, because certainly you can't blame a developer for not having, you know, perfect understanding of a million lines of code and how they interact with each other. So in this case, I started by basically seeing how we can, you know, compare the output of our tool, but then soon realized that the output is not enough.
Starting point is 00:30:46 We want information of how that import is shaped throughout this 500,000 line of code. So therefore, I had to kind of redesign the normal snapshot testing framework mindset and then change that in a way that we can capture as much data as we want from anywhere in our code that we want. And then basically make it possible to compare these at a scale because we certainly don't want to test a medical software with one or two test cases. We had 16,000 data sets that we wanted to test and it was a non-trivial problem for us to keep track of result files. So we had to come up with a way of storing them and certainly managing them in a separate server.
Starting point is 00:31:34 So yeah, it kind of sounds... Oh, I'm sorry, go ahead, Rob. No, go ahead. It sounds like you're trying to fit somewhere between unit testing and approval testing? That's right, yes. So the current product is kind of like an approval testing framework, or in the larger software engineering industry,
Starting point is 00:31:56 it's called snapshot testing, because there are tools in other languages that use that term. But essentially, I think the primary distinctive feature is that snapshot testing frameworks, they generate files. So the description of your product would end up in a file. Now, if you have 100 test cases, you end up with 100 files. Now, the common practice is to check in the trusted output of your code, which is called snapshot files in this case. You check them in to your version control so you have them handy.
Starting point is 00:32:36 So every time you run a new version of your code, you compare against those files. The problem that I see with this approach is when you make a code change that makes changes to a bunch of these files, then you need to individually inspect the differences in each file. Now, if you have too many of these files, that would be time-consuming, and it's certainly not elegant to have to compare a tiny little difference, for a tiny little difference, for a tiny little difference, all these other files to make sure that that's the case. So that's one issue.
Starting point is 00:33:10 The other is just checking in of these files into version control. To me, it's just a workaround. It's not... It's not a version control before, yeah. Exactly. It's not intuitive. So it seems that we need a way of capturing data in a way that enables us to capture them at a scale. And lastly, snapshot testing frameworks, they usually use the output of your test. So they've worked with whatever you print as standard output, or you kind of write into some file.
Starting point is 00:33:46 So that, to me, is also a blocker. Because certainly, at least for the code changes that I've been involved in, the pieces of code that I've broken over the years, the issue was not the output of the test. The issue was somewhere in the internals of my code. And we need a way to be precise about what we capture and want to compare. Okay, makes sense. I wind up the discussion for just a moment to bring you a word from our sponsor.
Starting point is 00:34:16 As C++ coders, I think you can say we've gotten used to waiting. Waiting for code to be compiled, tested, and even analyzed. My friends at IncredibBuild have been working decades to help developers like you and I code more and wait less. Because expert developers don't wait. With IncrediBuild's acceleration platform,
Starting point is 00:34:34 I've easily gotten up to 10x speedup on real-world projects right out of the box. What I like most is how easy it is to use. Their virtualized distributed processing technology turbocharges product development with just a lightweight IncrediBuild agent. No need to install any build tools or source code. Thank you. more HPCs and developers without adding a bunch more HPCs or developers. Go to incredibuild.com slash cppcast to start your free trial today. That's incredibuild.com slash cppcast to start your free trial today. So one question I had is if you're already using, you know, a unit test framework, you know, whether it's Google test or catch or anything like that, can you continue using that? Or are you going to be, you to be rewriting all those tests using Tuca?
Starting point is 00:35:28 I don't think that there is any value in rewriting any unit test. It seems that if you write unit test, it's usually for a good reason. And therefore there is no reason why we want to change that. So unit testing in general, I think, is very helpful and very effective. The problem is sometimes we cannot write unit tests for complex workflows that have complex inputs. At least, I don't know of a good way of doing this. So if you have a software that is fully covered with unit tests, I think you're in a very good position. But if you are dealing with a software that cannot be effectively tested just with unit tests,
Starting point is 00:36:12 then I think Tuca can be a complementary solution to kind of fill that gap and then have a way of reporting the differences for all of the team working on that product. I'm thinking like, there's a micro benchmark framework that's part of catch two now also. I don't think it's very well publicized, but there's it's been in catch two. So let's say, I run this micro benchmark framework, and I capture an output value from it. And I want to push that into Tuca. And so I'm basically want to use Tuca to capture the trending performance of my micro benchmark. Can I do that?
Starting point is 00:36:58 Yes. Yes, you can do that. But you certainly can have that output. As you are running your cache to test, you can basically take that output and then pass it to the Tuca server. So in this case, you are integrating the Tuca client library with your unit test executable. So that's possible.
Starting point is 00:37:24 And I think I like that approach more than having your output stored in a file and then later have a separate kind of process for submitting that output. Right. And can Tuca tell you that, you know, some numbers trending down or trending up or staying flat? Okay. Yes. So I guess at this point, the analytics that we provide for the behavioral differences is slightly more mature and more helpful
Starting point is 00:37:55 than for the performance. However, Tuca can show in its UI the trends of differences for different functions over different versions of the code. And I have found it very helpful because usually you want to look at a chart and see that over time, your tiny little code changes kind of amount to a performance hiccup or a slowdown in a piece of workflow. So all it does right now, the interface is just giving you a chart,
Starting point is 00:38:33 showing you how the performance changes over different versions. For many things, that's good enough, yeah. Yes, I think that might be enough for people to find value, but I have much higher ambitions in terms of how to make it more helpful for users. In this case, I really think that tracking performance is slightly more tricky than behavior because it usually has some noise. It is not guaranteed to stay exactly the same. So you want to have a way of looking at the performance changes
Starting point is 00:39:14 as a human being looks at it, right? We look at a chart, we say, oh, well, this dropped 1% or 2 for this particular version, but we don't care as long as this is not consistent over a few versions. Should we talk more about the actual Tuca website interface and what that process is like? I did watch one of the videos you put up
Starting point is 00:39:36 that was linked to that article two weeks ago. Awesome. Yes, so again, I think the Tuca server helps you not just store your results, but also compares them against different versions. Now, the comparison certainly has some output in the differences. So the server attempts to visualize those differences. And the visualization of those differences is something you can look at for each given version. As your test is being executed, you can go have a look.
Starting point is 00:40:13 But then after all the test cases are executed, the server is going to send a notification to members of your team and say, hey, this particular version that you just ran has these differences. So that report that it generates is more high level because it's meant to be consumed by everyone to give them an insight into how
Starting point is 00:40:37 this particular version has behaved. So you can look at that email or a Slack notification and say, well, this is not what we expected, so let's go take a look. But in some cases, you may make intentional changes to the software, in which case you would use the platform again. You go there, you click a button and say, from now on, this is the version that I trust, that my software should behave from now on. So that is also kind of automated, the way that all these versions are basically tracked. You can add comments to kind of communicate with your team members
Starting point is 00:41:19 and then have audit logs for how your software is evolving over time. Do I need to inform the Tuca server that I'm going to add some new variable or value, or can I just do that ad hoc? You can do that ad hoc, yeah. In fact, you can do that with test cases as well. You can add as many test cases as you want at any one time. The advantage of having this remote test server is that it is aware of the test cases that are part of your baseline version, right? So the advantage for that is you can then not have
Starting point is 00:41:54 your test cases listed as you're running the test. So a normal Tuca test code doesn't specify its test cases because when you have access to the library, the library can query the list of test cases from the server. And therefore, you don't have expected values and you don't have the list of test cases. All you have in that regression test code is how you run your workflow under test and then what data you want to capture. Okay. I don't think we've mentioned yet what platforms, compilers are supported.
Starting point is 00:42:31 Yes. So the C++ client library for Tuca has support for all platforms. So the, well, all known commonly used platforms, right? Windows, Mac. Linux, and... So it can work on my Commodore 64 code?
Starting point is 00:42:47 I am very, yeah, I am regretting having said that. But in general, I've basically, I think because the end users might be using code bases that are somewhat dated, I wanted to support a wide range of
Starting point is 00:43:04 compiler versions, a wide range of standards. So starting C++11, it kind of covers, you know, the relatively recent versions of the C++ standard. And then the compilers, I think GCC8 and Clang 8 are both supported, as well as MSVC, I think, for 2013 is supported, which is giving me a lot of headache, to be honest. Yeah, I wouldn't... You might drop it.
Starting point is 00:43:41 No. Yes. You might drop it going forward. Yeah. That's right. No. Yes. You might drop it going forward. Yeah. That's right. Yeah. I think 2015 would even cover most of the current console dev kits, actually. That's right.
Starting point is 00:43:54 However, because this started as a side project and then kind of obtained as an internal tool at my former employer, I had to make it possible for us to use it with, you know, the most outdated codes that we had, which happened to be using MSVC 13. So I feel like I will should ask this, or I'm very curious. Anyhow, you mentioned in your bio that you recently left your former employer to pursue this. And then you've also just publicly said that you began development of this project at your former employer. So you left with their blessing, I assume. Yes. So for our listeners. Yeah. So when so at least in the US, I think the norm is that when you start working somewhere, all your future IP, developed IP, would belong to the company that you are working for. And that seems a little weird for others, but here we all understand that.
Starting point is 00:44:57 Now, the problem was because this was started when I was working at my employer. So I basically had to work on it only at nights and weekends. But then because I was so obsessed with it and everybody knew that I'm working on it for so long, they all had my, you know, kind of their blessing. So I got the IP rights and that's only when I was able to start speaking with people and showing the product to them outside the company.
Starting point is 00:45:31 And the way you had described it, I thought that you had actually started development on this, on the company's dollar, which would have the, they would own it, I think, in pretty much any country if that had been the case. That's right. Yes right yes yes but this was a side project uh and for me it was very important uh to kind of separate the work that i am uh doing for this project with all the other work that i do um for the company you know right tasks okay um so yeah you said you've been working on this i think for two or three months uh kind of on your own um full time yeah how well has that been going like are you getting like funding Jeff users no i don't have funding yet um and i'm not seeking funding right now i'm uh i'm seeking my first 10
Starting point is 00:46:18 customers uh 10 paying customers i should say um i have, since leaving, you know, YDL, I have now about 10 active individual users that are using the product on a weekly basis. Now, there are also a couple engineering teams that, you know, are evaluating the product for larger scale use in their organizations. Now, I still don't have any paying customer, and I'm actively working on finding my first time. I want to say right now that my primary objective is to get as much feedback as I can from actual people who have this problem, the problem that we had at my company. So the conversations I've had with software engineers
Starting point is 00:47:14 over the past three, four months has been extremely interesting. Just to find that we all are suffering from the same issues is kind of giving me, you know, goosebumps. But in general, it's also very motivating in terms of, you know, the possibilities of improving what can be improved in terms of how we maintain software at large scale. Well, so if listeners are interested in trying out the project and, you know, becoming a user, paying or, paying or otherwise, what kind of options are there? So right now, the platform, so I should say that the remote Tuka server can be deployed on-premise as well as a cloud-hosted version.
Starting point is 00:47:58 So the cloud-hosted version is deployed at tuka.io. Listeners and anyone who is interested can actually go create an account for free. They can use the client libraries for C++ and Python to submit test results. So everything is free right now. But once they want to use it for teams of up to, say, more than five members, then right now there is a pricing plan, but it's still like, I don't want it to be a blocker. I want to just have users that are using the product and not worry about paying. So in general, everything right now is as free as it can be. And I'm going to keep the free plan for the Tuka.io platform forever permanent.
Starting point is 00:48:53 So there's always a way of working and getting value out of this product without paying anything. Okay, very cool. So like I mentioned before, the project i'm working on already has a regression solution that does work uh so i don't know if we would be interested for this for this particular um for that problem with your solution but i am like really needing to start tracking more statistics like binary size, bloaty output, like, you know, ABI diffs,
Starting point is 00:49:29 that kind of thing. Can I use the Python interface to push those things in a meaningful way to, okay. So I think it was two weeks ago when you mentioned that on a broadcast. And it's, it's interesting that it was the first time that I was thinking of this, you know, being a possible use case. And therefore, I was interested enough to try it with a new client library for Python. So I'm calling the, you know, ABI diff, and then getting the
Starting point is 00:50:00 output and submitting it to a platform. It's very simple. So by the time that this podcast is released, I'm hopeful that I can have a, you know, GitHub repository that is public that shows, you know, how to do this. So that, you know, anyone who is interested can try. Awesome. Okay, well, where should listeners go if they want to go and try out Tuca, Peshman? I would recommend that they start with tuca.io. There are links there, especially in the documentation, to GitHub repository for the C++ client library, if they want to start integrating that and then getting started. And there are so many different getting started guides in terms of, you know, just if they are curious to get more information about how, you know, the platform works.
Starting point is 00:50:54 Okay, anything else you wanted to tell our listeners about before we let you go, Peshman? Well, I just wanted to mention that, you know, this project started with the C++ client library. So it's near and dear to my heart in a way. And I feel like the best I can get out of the C++ community is their advice and feedback. So I'd like for your listeners to check out the product and then see if they can find it useful or not, and then share their thoughts
Starting point is 00:51:25 with me as I'm starting this journey. Another thing I wanted to mention is for the past three, four years that I've been listening to this podcast, I've been thinking that I would come, if I would ever join you, I would plug the work that you are doing at my former employer now, which is certainly interesting, technically very challenging. And for a C++ engineer, it's the best type of work that I've ever experienced. So I'm sure that we are still hiring. And if you're interested, looking for a fun challenge, check out Vital Images. I don't think we've ever talked about medical imaging on the show before, but I'm sure there's a lot of C++. I turned down a job that was related to medical imaging
Starting point is 00:52:13 because it made me too nervous. What if I got something wrong and I didn't want to kill someone? Basically. Yes, but on the flip side, writing good code might actually save someone's life. Yes, that's what my wife tells me. Thank you so much, Benjamin. Thank you again.
Starting point is 00:52:31 Awesome, thanks for coming on. Thanks so much for listening in as we chat about C++. We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, we'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter. You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank all our patrons who help support the show
Starting point is 00:53:01 through Patreon. If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode is provided by podcastthemes.com.
