CppCast - Mutation Testing with Mull

Episode Date: May 9, 2019

Rob and Jason are joined by Alex Denisov to discuss mutation testing in general and the Clang-based Mull project. Alex is a software engineer working at PTScientists GmbH, a German aerospace startup that is planning to land a spacecraft on the Moon. After work, he organizes LLVM Social in Berlin and researches the topic of mutation testing. He is generally interested in developer tools, low-level development, and software hardening.

News
- Converting from Boost to std::filesystem
- Kate Gregory ACCU trip report
- GCC 9.1 Released

Guest
- Alex Denisov @1101_debian
- Alex Denisov's GitHub
- Alex Denisov's Blog

Links
- Mull Project
- Awesome Mutation testing
- 2019 EuroLLVM Developers' Meeting: A. Denisov "Building an LLVM-based tool: lessons learned"

Sponsors
- Backtrace
- Announcing Visual Studio Extension - Integrated Crash Reporting in 5 Minutes

Hosts
- @robwirving
- @lefticus

Transcript
[00:00:00] Episode 198 of CppCast with guest Alex Denisov, recorded May 9th, 2019. This episode of CppCast is sponsored by Backtrace, the only cross-platform crash reporting solution that automates the manual effort out of debugging. Get the context you need to resolve crashes in one interface for Linux, Windows, mobile, and gaming platforms. Check out their new Visual Studio extension for C++ and claim your free trial at backtrace.io/cppcast. And by JetBrains, maker of intelligent development tools to simplify your challenging tasks and automate the routine ones. JetBrains is offering a 25% discount for an individual license on the C++ tool of your choice: CLion, ReSharper C++, or AppCode. Use the coupon code JetBrainsForCppCast during checkout at jetbrains.com. In this episode, we discuss std::filesystem and GCC updates. Then we talk to Alex Denisov.
[00:01:17] Alex talks to us about mutation testing and the Mull project. Welcome to episode 198 of CppCast, the first podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how's it going today? I'm all right, Rob. I'm excited and anxious to get going on Core C++. This will air right around the time the conference is starting. I mean, it's, what, Friday, and the conference starts on Monday or whatever, but yeah. You'll basically be getting on a plane right after this airs, right? Yeah, pretty much, yeah.
[00:02:08] Awesome. And giving my workshop there. And I just realized, I don't think... have I mentioned that I am officially giving a workshop at CppCon? I think so. Okay, so it's a one-day class on constexpr. And I have not mentioned yet that I am now officially also giving a workshop at NDC TechTown in Kongsberg, Norway. I think we mentioned that you were going to that conference
Starting point is 00:02:33 a week or two ago but i don't think you mentioned having a class there yeah my talk and my workshop are both officially announced for that as well awesome you have a very busy year this year, don't you? I do. Yes. And most of my busyness are things that we don't ever actually talk about. I mean, there's, there's a few conferences, but as far as like training and stuff, we don't generally talk about the specifics of that or anything, but yes, I've also had a good year with training so far awesome okay well at the top of our episode i'd like to read a piece of feedback um we've gotten a couple tweets uh with suggestions for our 200th episode uh we got this one here from uh michelle saying i believe a special guest like herb sutter would be an excellent way to celebrate the 200th episode. And I think Herb actually saw this tweet himself and reached out to us. So it looks like we will be having Herb Sutter for our 200th episode in a few weeks.
Starting point is 00:03:34 Yeah, this would be fun again. Yeah, it'll be great to talk to him again. And yeah, I'll just put out there if any listeners have any specific questions they would want us to ask Herb, please let us know. I think we're going to be focusing on some of his proposals about the future of C++ and trying to make it easier to use, right? Yeah. When did we last have him on? Was it before the 100th episode? It was a long time ago. I think so. I think it was before the 100th and it was right after one of the iso meetings i think where c++ 17 was finalized you're right i think it was the c++ 17 finals oh that was it was right after uh the one that was up uh in the you know daylight sun they were like uh midnight sun or whatever you know like the sun wasn't setting, whichever meeting. Yeah.
Starting point is 00:04:26 Well, it'll be great to talk to her again. Um, and we'd love to hear your thoughts about the show. You can always reach out to us on Facebook, Twitter, or email us at feedback at cbcast.com. And don't forget to leave us review on iTunes or subscribe to us on YouTube.
Starting point is 00:04:38 Joining us today is Alex Denisov. Alex is a software engineer who is working at PT scientists, a German aerospace startup that is planning to land a spacecraft on the moon. After work, he's organizing LLVM Social in Berlin and researching the topic of mutation testing. He's generally interested in developer tools, low-level development, and software hardening. Alex, welcome to the show. Thanks. It's a pleasure to meet you guys. What are the plans for
Starting point is 00:05:05 the spacecraft and will you get to be on board of it um well no fortunately or unfortunately depending on the state of the software but uh generally the idea uh yeah basically the idea is to land a lander on the moon and bring some payload including two two rowers that is also built uh in-house at the company so this is an unmanned spacecraft that you're working on yeah yeah yeah and crew yeah basically is it uh research is it the are you going to try to bring anything back or anything like that uh no it's one way so far it's uh uh it's well it's not research in terms of like academia or something, but it's just a private company. So basically the idea is to be a kind of provider for payload. So what?
Starting point is 00:05:55 Earth, Earth, Moon. I think every Moon shot so far has been government financed. I'm pretty sure. And you're saying this is a private venture? Yes. Recently, actually, there was one from Israeli team. been government financed i'm pretty sure and you're saying this is a private venture uh yes recently actually there was one from uh israeli team they uh i think it was partial success so they reached the moon but it crashed right before landing uh so and they they were the actually i think the first private company who did that. But yes, you're right. Before that, all the companies, all the shots were government-based.
Starting point is 00:06:28 And now I'm curious, what is your actual involvement in this? I am a software engineer there, and my responsibility is the quality of the onboard software for Rover and Lander. So I need basically to find ways to make sure that the thing does what it's supposed to do and considering that there is some track record of when we try to land things on other planets they may or may not crash instead of actually landing there's absolutely no pressure on you whatsoever for the quality of software, right? Yeah, sure, sure. Yeah. Okay.
Starting point is 00:07:09 Well, Alex, we've got a couple news articles to discuss. Feel free to comment on any of these, and then we'll start talking more about your experience with unit testing and mutation testing, okay? Yeah, sure. Okay, so this first one, we have a post from Bartek's blog, and it is about converting from the Boost file system library to the new std file system that's in C++ 17. And I thought that was a pretty good article.
Starting point is 00:07:33 You know, he points out how a lot of code might just be able to, you know, just work by changing the namespace. But there are a couple caveats that he found, where certain things are just missing from the std file system implementation, or a couple things that work a little bit differently. And I think it's worth pointing out, this is a guest post also Scott furry. Oh, I didn't realize that. Okay. Yeah. Yeah. So what did you think about some of the caveats and differences between the file system libraries, Jason?
Starting point is 00:08:05 I've experienced some of this, having to deal with trying to extract a boost where possible and moving to standard file system where possible. So I'm like, yep, good idea to write an article on it for sure. The one thing that really stood out to me is apparently the std file system path does not have a reverse iterator i'm really curious about the design decision for that because all the other container like things have both forward and reverse iteration and a reverse iterator would be to like go up the tree and get like your parent directory right wait i thought no this is just this is just the path okay so the normal path iterator just lets you iterate over each element of the path, each thing between slashes, basically, right? Okay.
Starting point is 00:08:49 A reverse iterator should just do that backward, and every standard container has rbgen and rend. And apparently this doesn't. I don't know. That's weird. Yeah, I wonder why they chose not to include that. I mean, because the std file system is definitely based on boost file system originally, right? Yes, it is. But I know a lot of little tiny details came out in review for coming into the standard. Alex, do you have any thoughts on this one? I'm just curious,
Starting point is 00:09:18 what could go wrong if you do the reverse iteration? That could be the limitation. Maybe some, i don't know consideration was dot dot or right there must be there must be a reason like some some people just could not agree on what it should do so probably yeah yeah okay uh next article we have uh kate gregory wrote an excellent trip report for her trip to ACCU where she gave the closing keynote. And she highlights a couple other talks and keynotes, including one from Herb Sutter that I definitely should go and watch. Another one from Kevlin Henney, who we've had on the show before.
Starting point is 00:10:00 And a couple other speakers I haven't heard of, like Angela Sass giving a talk about security. So it definitely sounds like it was a good conference. It does, yeah. Did you make it to this one, Alex? To the conference? Yeah, to ACCU by any chance. I mean, that's closer to where you are. No, unfortunately.
Starting point is 00:10:18 I was busy with the URLLVM recently, so missed other conferences this time. Okay. I think Kate does have some good tips in here that in general she's talking about her own survival from cancer and dealing with fatigue when traveling and that kind of thing. But I think it's some good notes
Starting point is 00:10:38 just in general for people who are traveling for these conferences to consider. She says, yeah, a travel day is a travel day. She doesn't try to work. She just tries to relax and listen to music and take it as a travel day. So she doesn't start the trip worn out basically. And that's probably good advice for most people going to these conferences. Yeah. I mean, Kate's obviously gone through a lot and it's great that she's able to still attend these conferences and give, uh keynotes. But yeah, I mean,
Starting point is 00:11:05 that's good advice for everyone because conference burnout is definitely a thing. Oh, it is a thing. And since we're talking, you know, CBP con, it's going to be four months now, the tickets are for sale, everything's up. That's sorry, I am definitely thinking four months out at the moment with my travel plans and such. It's hard to believe it's already four months away. Yeah. And a five-day conference, it's very long. Don't underestimate just how long it is if you come. Okay.
Starting point is 00:11:35 And then the last thing we have is the GCC 9.1 release. And the big news with this one is that C++ 17 is no longer marked experimental am i the only one that found that news slightly disturbing that's taken uh almost three years yeah no no no the fact that we've all been using it and not realized that it was experimental that's the part that bothers me that's a good point it's a good point. Anything else worth highlighting here, though, Jason? Well, I did go and look, and I saw that GCC 9 does actually have official support for file system parallelization, which I didn't
Starting point is 00:12:13 realize, so that's something. And I'm sorry, did I interrupt you, Alex? I don't know, I just wanted to mention that I learned a nice thing about it, the info opt, the opt info flag that shows the optimizations, which optimizations were made and which optimizations were missed. So I'm more of a
Starting point is 00:12:35 LLVM client guy and I didn't know about this feature. So it's really I'm really curious now to compile my code with GCC and see what's going on there. I will also definitely have to check that out. What was the flag name again? Dash foptinfo or finfoopt. Okay. Something like that.
[00:12:57] Yeah, I think it's marked in the release notes. Yeah, it's -fopt-info. Okay. All right. I need to definitely check that out. Thank you very much. Yeah. Okay.
Starting point is 00:13:10 So Alex, when you reached out to us about being on the show, you mentioned that you work on a project called Mull. Could we start off by telling us a little bit about that? Yeah. So briefly, it's a tool for mutation testing that is aiming C and C++ languages mostly. And yeah, if you're curious about what is mutation testing, that is aiming C and C++ languages mostly. And yeah, if you're curious about what is mutation testing, the official description is that it's an approach to help you identify weak spots in the test suite.
Starting point is 00:13:39 But I think it's just much more than that. So it basically helps you find some gaps in coverage, in semantic coverage. So I'll try to give an example. So the premise of modulation testing is that code coverage doesn't make much sense because you can have a test that provides you 100% code coverage, but the test doesn't have any assertion in it. So basically the test just cannot fail, but you will still get like 100% coverage and you will get kind of misleading feeling of correctness of the program that you have some
Starting point is 00:14:20 quality or yeah, that it works somehow good. But mutation testing, it actually helps you to find spots in the code that are not well tested, basically. Okay, so what does it actually do? What are you mutating? So, yeah, the subject of mutation is the program itself. Just to, yeah, an example, let's say you have a function, yeah, function sum that takes A and B, two parameters, and returns the A plus B, right?
Starting point is 00:14:57 Okay. And you write a test that, like, asserts that sum 5 and 10 is more than 0. Okay. So mathematically speaking, it's absolutely correct test. And you will get 100% code coverage. So what mutation testing will do, it will go to the sum function and will try to replace plus with something else, like plus with minus or plus with multiplication,
Starting point is 00:15:24 or maybe just remove everything and just return zero or crash and the test should actually catch that the test should fail after the change so and in this example if you have uh yeah assert some 5 10 more than zero if we replace plus with multiplication you will still get the test will still pass. So it means that there is something fishy, either in the implementation or in the test. So if it mutates the test, and the test still passes, something is probably wrong. Yes, not necessarily test either. Well, it, the goal is to mutate the program that is being tested. But it could also mutate the test itself.
Starting point is 00:16:10 So basically it just mutates the program. So swapping out a parameter is just a hypothetical. It's just whatever thing gets mutated. Yeah, that could be anything you can imagine. And if we look at some academic papers, there are lots of suggestions with so-called mutation operators, and there are really lots of them replacing while loop with just one iteration instead of n iterations,
Starting point is 00:16:40 or flipping if-else branches, or even going as far as changing the hierarchy of inheritance in object-oriented programming languages. That's interesting. So the goal is to kind of replace your actual code with incorrect code or garbage in some instances and make sure that your tests are catching the erroneous behavior coming out of it, right? Yeah, yeah. I think the original hypothesis was that developers write almost correct code,
Starting point is 00:17:11 so they make small mistakes, and the test suite is good if it can catch those small mistakes. So that's why mutation testing introduces those small changes. Is this a fairly new concept, mutation testing? Because we've been talking about unit testing and improving those for a couple of years now, but I've never heard of mutation testing. I think it's 71st.
Starting point is 00:17:36 It was first... Yeah, it appeared earlier than fuzzing, for example. Oh, wow. So it's almost like 50 years on the market, basically. But it's not that popular because, first of all, you should have tests in your project
Starting point is 00:17:54 to run mutation testing. Right, you need to test first. Yeah, not every project has them. And there was another problem with the performance because a program, even a because program, yeah, with even a small program, you can have multiple mutants. So just as an example, with LLVM,
Starting point is 00:18:14 we took just one part of the test suite, like 500 tests, and it produced 10,000, well, 11-something thousand mutations. And for that, you should evaluate all of them, which takes time. So when you say produced a mutation, I'm going to see if I can get this terminology correct, you mean a mutation that survived, basically, one that did not alter the behavior of the program? Well, not produced, I mean that, so you find a place in the program that can be mutated.
Starting point is 00:18:46 Without changing the behavior. Yeah. Okay. Yeah. According to the unit test, at least. The unit test that we're passing with the mutation passed without it, right? Yeah. Well, if you're talking about survived or not survived mutations, then yes.
Starting point is 00:19:01 Whenever you change something and test doesn't fail, it means that this mutation survived. It's kind of terminology. I definitely feel like I'm having a little bit of a hard time understanding this, but if it produces a mutation and the test does fail, do I still need to investigate that mutation? No. Okay.
Starting point is 00:19:27 So it's considered as non-survived or killed and if it doesn't fail uh then you probably should investigate that so you ran it with llv you said against llvm right uh yeah just as a um test like um just as an example. Right. Yeah, yeah. And that found 10,000 things to investigate, which is a lot. No, no, no. That's, yeah, sorry, I made you confused. So it found, it just created 10,000 mutants. Okay.
Starting point is 00:19:59 But not all of them survived. Ah, okay, okay. So, yeah, that percentage was much much lower like that's good well i mean it's it's it was still still huge like 40 percent of them survived uh but yeah not still not all of them needs to be considered and yeah not yeah yeah yeah you're not trying to say that lvm is full of a bunch of giant bugs. You're just saying this is what we investigated and found some high-level things possible. Yeah, yeah, basically. So in this case, LLVM was just baseline,
Starting point is 00:20:34 because if our tool is sufficient enough to work against LLVM, then it's probably sufficient enough to work against other projects. Right, LLVM. LLVM, like, one million lines of code and quite big, LLTMs. LLTMs have like 1 million lines of code and quite big, quite big. Okay, so the goal of mutation testing isn't necessarily to find new bugs, but just to help you create new unit tests
Starting point is 00:20:55 to better cover your code with unit tests, right? I would say both. It is both, okay. Yeah, yeah. I think, again, if you look at academia then they mostly treat it as a way to improve your test suit but in my opinion and my practice uh yeah i just have have evidence that it also helps to find bugs in the software so are you actually mutating the source code or the binary? Something in between.
Starting point is 00:21:26 We are mutating bitcode, LLVM bitcode, which is not source code and not yet binary. Okay, so let's say I take mall and I run it against my project, and it says here is a mutation that I did to the bitcode, and this mutant survived now what what do i do with that information that i've been given how do i relate that back to the source code of my project so for that we relay rely on debug information okay yeah otherwise it doesn't make sense to i mean we cannot point you to place in the bit code because it doesn't make any sense. So currently, I think we have several reporters implemented. One of them produces
Starting point is 00:22:11 information in the form of warnings, like Clang or GCC does, so that it can be plugged in into IDE and you just run mall against your project, and it just spits out a bunch of warnings saying that, okay, pointing to this column in that line in the file, that, okay, here we replaced plus with minus, and nothing happened. And then your goal is to actually get the test that was run against this mutation
Starting point is 00:22:38 and think about how this test can be enhanced to actually kill this mutation. Okay. So I'm thinking if it... Now I'm going to use hand signals, so only people watching us on YouTube can actually see this. If it mutates some bit over here, but the entire test suite is only running in this part of the binary.
Starting point is 00:23:07 It seems like I should not get a report on that. I only get a report if it modifies something that actually executed at that time. Yes. Okay. And for that, well, it's actually great optimization that many robots, like many mature tools use, they use, and we do this as well, instrumentation. So basically, if the code was executed, then we mutate it. If it wasn't executed, then not. It
Starting point is 00:23:34 helps a lot. With that example of LLVM, if we just apply all possible mutations we have on the LLVM test suite, then we got like 60,000 of mutants. But with just instrumentation, it was like 10,000. So it significantly decreases also runtime.
Starting point is 00:23:56 I have this morbid curiosity. I want to run it against my latest projects, but I really don't want to run it against my projects. I don't want to run it against my projects like i don't want to know what it would find well it may find some um helpful things um i think so yeah the project was started basically as kind of research uh because i was curious if it makes sense this i mean the idea sounds interesting but uh it wasn't clear if it actually helped somehow. And I think the first kind of bug was found in a real project, and I was literally dancing around my apartment.
Starting point is 00:24:34 And it was a really good one. I'll try to explain what was happening. So it was a real-time operating system that, yeah, it's proprietary. Well, it's open source, but it's kind of also proprietary part. It has proprietary parts. So this operating system, it has some tests for mathematics, and there were some work tests for metrics, for, yeah, matrices. And that test was doing something like multiplying several matrices and then comparing the results that some like particular column row some particular value
Starting point is 00:25:14 is equal to some uh like specific constant okay and there was in that, like 100% code coverage. But when we started mutating it, the mutation coverage was very low. And I just could not understand. I was looking at the code like, okay, it looks legit and it's executed. It all works, but what's going on? And it turned out that the test used a function called compareDouble, a custom function that takes two doubles and then compares them. Basically, it takes the absolute difference between them and checks if it's smaller than a small value like absolute. Right. But it turned out that the function signature was taken to booleans for some reason uh typo or i don't know but it was like compare
Starting point is 00:26:07 double but boolean like pool a pool b and it was always giving uh true except of the case when one of the elements is zero wow when when i changed i mean and it's legit code, right? You can do that. So I think Clang was, if you enable all warnings, it gives a warning, but GCC, yeah, doesn't give, doesn't tell anything. So when I changed actually this double, tool to doubles, then I found that like really huge number of tests was failing. So yeah, the developers who rolled the test, they got probably wrong assumptions about what should should
Starting point is 00:26:45 be there and uh yeah in the test the i j the column and rows in the metrics they were swapped basically and there was yeah quite quite big test suit so that was uh interesting discovery um yeah uh another example that i like a lot also in this same operating system, there was a test suite for a file system. It's basically some interface. You have class file with four methods, like open, close, and read, write, basically. And what happened that was mutation. We just removed basically, well, almost everything from the open function but
Starting point is 00:27:26 the tests were still passing and i started looking into it and i found that it's possible to get to the state in the program when you don't open the file but you start reading or writing and if you read from file that is not open then it will then it will want to terminate the buffer that you're reading to, and it was just writing 0 to minus first index. Wow. Basically, why I'm giving this example, because I think it's quite hard to find such bugs with some other tools. With static analysis, well, maybe, maybe not. Not entirely sure. That's very interesting.
Starting point is 00:28:16 Are you running it on your Lunar Lander code that you're currently working with? So, yeah, it's on our plan to edit. So it needs integration into CI pipeline and so on, but it's definitely planned. How much effort do you think that would be? For the sake of our listeners, if they want to draw this
Starting point is 00:28:33 into their CI, how difficult would that be? If the project can be compiled with Clang, then it should be really straightforward. You just add two more flags or one more flag to the build system and then you run mall against
Starting point is 00:28:50 the tool. So that should just work. And yeah, there are also some pre-built packages for Linux and Mac, free BSD as well. So yeah, it should be straightforward. It probably would require some tuning afterwards,
Starting point is 00:29:07 but as the first step, it's just almost one liner. I want to interrupt the discussion for just a moment to bring you a word from our sponsors. Backtrace is the only cross platform crash and exception reporting solution that automates all the manual work needed to capture, symbolicate, dedupe, classify, prioritize, and investigate crashes in one interface. Backtrace customers reduced engineering team time spent on figuring out what crashed, why, and whether it even matters by half or more. At the time of error, Backtrace jumps into action, capturing detailed dumps of app environmental state. It then analyzes process memory and executable code to classify errors and highlight important signals such as heap corruption,
Starting point is 00:29:44 malware, and much more. Whether you work on Linux, Windows, mobile, or gaming platforms, Backtrace can take pain out of crash handling. Check out their new Visual Studio extension for C++ developers. Companies like Fastly, Amazon, and Comcast use Backtrace to improve software stability. It's free to try, minutes to set up, with no commitment necessary. Check them out at backtrace.io. What kind of runtime would this take for it to introduce mutations and then rerun the unit tests against it? Like, I could imagine it, you know, creating just endless numbers of mutations and rerunning the unit tests over and over again.
Starting point is 00:30:20 Yeah. So it depends on the size of the project obviously but i think we did this with open ssl some time ago and it was just one minute three minutes maybe depending on the test suit okay for llvm for one test suit that was about three hours two two, three hours, something. And well, it's not the fastest machine, so I was just running on my whole machine. So it definitely, again, depending on the size of the project, it either can be included into normal CI flow
Starting point is 00:30:57 or as part of nightly builds. So, I mean, maybe to expand on Rob's question is, how does it know when it's done well it's pretty deterministic because you have number of instructions and a lot of instructions bitcode basically and you have number of mutation operators the rules how to mutate them okay so we basically go to each instruction and check, okay, can we apply this mutation to this instruction? And so on and so forth. Okay.
Starting point is 00:31:29 Basically, if you rerun it 10 times, the result is always the same because it changes the same parts. It's not like fuzz testing that could effectively run indefinitely. Well, not this implementation, at least. So currently, it just introduces one mutation per test run. But I mean, technically, it's possible to introduce more. And then you have more combinations, of course, and it just explodes.
Starting point is 00:31:59 But yeah, it's just deterministic, and you know what would happen. Okay. Now, since I just mentioned fuzz testing, how do you see its relationship to fuzz testing? These are complementary things that should both be run? Yeah, I think they are orthogonal. So mutation testing, the subject of mutation testing is the program itself,
Starting point is 00:32:22 and the subject of fasing is input okay so you've been yeah complementary probably is the the right word in this case that makes just cover everything okay uh do you think everyone should be doing this i mean should mutation testing be like the next big thing should we all be adding it to our projects absolutely but, of course, it depends. It definitely was a try, because after I started using it, I changed also my approach to write tests. Because when I write something, I think, okay, can mutation testing help me detect this or not? And some clever, just becomes more straightforward because of that. Yeah, I think talking about next big thing, I hope eventually we will get rid of code coverage in general and we'll just stick to the mutation coverage
Starting point is 00:33:17 because it basically subsumes it. So mutation coverage, I mean, code coverage is a subset of mutation coverage. That sounds like a pretty strong statement, really, because I know some people are very serious about their code coverage. Well, but as I said, it doesn't prove anything. It just proves that the line of code was executed that's it right but yeah it doesn't it doesn't say anything about the semantics of the program it's very interesting i mean don't you need like a combination of the two though because if you have very low code coverage mutation testing
Starting point is 00:33:57 isn't going to find as many you know possible surviving mutations for you. Yeah, yeah, that's right. I think that code coverage makes sense only if it's zero or less than 10%. If it's more, then you should use some other techniques to actually check what it does. So you said that using mutation now is kind of driving, it's kind of changing your behavior for for how you write test and i know some people are like very like high level like we only do behavior driven test because we want to know the behavior of the program to feel free to change the implementation and some people are very strict about doing unit tests on every function and do either of those things i mean like like come into your thought process when building tests now i I think, well, these are just different levels.
Starting point is 00:34:46 And you can also do mutation testing on the level of integration tests. Of course, depending on what kind of test suite you have and what kind of integration you're testing, if you have a web server running PHP and your program running C++,
Starting point is 00:35:02 then it's probably hard to mutate all the things. But if it's just a bunch of subsystems written in C++, for example, then it's doable. Wow. Do you think... Are you aware of other mutation testing
Starting point is 00:35:18 products that we should be aware of? Because you said Mull is limited to Clang. Is there anything available that you're aware of for MSVC or GCC? So there are a bunch of other tools that also target C and C++, but they just work differently and they are agnostic to the compiler. Yeah. So I think we are tied to Clang and to LLVM
Starting point is 00:35:42 because we just use a lot of LLVM facilities to do things with the LLVM bitcode. And again, one of the reasons is to get better performance, because other tools... Yeah, I was actually reading a paper once, I think I was on a plane, and the paper showed some
Starting point is 00:36:00 numbers, like, our tool did this, and it showed that analysis took, I don't know, 60 and the actual execution took 72. And I thought all the numbers were in seconds. And then I looked at it like, okay, their execution time took 72 seconds and our tool takes like four hours for a similar thing. And I was really upset.
Starting point is 00:36:29 But then there was a short note below the table saying that the execution time is in hours. Oh. Yeah. Okay. So it's about performance in our case. Are these other tools doing, like, source code manipulation instead of bitcode manipulation?
Starting point is 00:36:49 Yeah, there are several tools. I will send you a link and you can attach that. There is also an Awesome Mutation Testing list on GitHub with a bunch of links to tools for different languages. Yeah, there is one nice tool called Universal Mutator. It just uses regular expressions to mutate whatever source code you give it. So that should work on any platform, for any language, basically. Now, how mature is Mull itself?
Starting point is 00:37:19 Is it ready for production use for our listeners? I think so, yes. It is three years old. I don't know if it means something or not, but recently I released the first version. And yeah, I did that recently because I felt that, okay, now it's ready. So you speak at LLVM conferences
Starting point is 00:37:41 and you have your own LLVM user group, you said. Yeah. How is it being received by the other people in the LLVM community? Well, there are different opinions. Some people are interested and they're curious. And I think in general, if you're talking about developers, not that many people heard about mutation testing. Okay. So, yeah, most of the people are interested and curious about the approach and how that works, what it can bring.
Starting point is 00:38:10 But yeah, some people, as usually they say, my tests are good. I don't need that. I feel like it's impossible to say that your tests are good enough. Yeah. That's kind of the point that I've reached after having gone down these roads many times and then feel like, well, now I run fuzz testing against my project and I discover, you know, an entire class of bugs I didn't even know is possible. Right.
Starting point is 00:38:38 And I can only imagine mutation testing will again, find that kind of thing for me. Well, which is probably good. Right. Did you, you say it's specifically targeted for C and C++, but since it does manipulate LLV and bitcode, could our Rust listeners use it, for example? So, yes, I think the original idea was actually to build it like language agnostic. Right. But it requires some different approach to Rust and C++
Starting point is 00:39:08 because for C++, we can just mutate one instruction, like replace add with sub. But with Rust, for example, we need to mutate four instructions because there goes the instruction itself, and then there is a sanity check that there is no overflow for example or underflow so it's just a different approach and we had a prototype that worked for Rust
Starting point is 00:39:32 and we had prototype we still have a prototype that works for Objective-C on Mac and we had a prototype that worked for Swift on Linux so but yeah Mac. And we had a prototype that worked for Swift on Linux. But yeah,
Starting point is 00:39:50 we don't have that many resources. Basically, we decided to target only C and C++. Especially because this is the language that we are used to more. Makes sense. You also have another project called System Under Test. Is that right? Well, yes. Technically, I do. Is that right? Well, yes, technically I do.
Starting point is 00:40:06 But it wasn't updated for a long, long time. Oh, okay. The idea was to learn how different systems are being tested, like LLVM, Postgres, FreeBSD, whatnot, because they all have different approaches. But, yeah, don't have that much time to do that anymore. Any conference talks coming up soon? Well, just from me, you mean?
Starting point is 00:40:36 Yes. It looks like you've spoken at several, right? Yeah, that's right. There is the recent one from your LLVM where I'm talking about the mall. Well, not about the mall itself, but about the challenges and some caveats that we faced while building this tool.
Starting point is 00:40:54 Okay. Yeah, so if any of the listeners want to build some tool based on LLVM but doesn't want where to start, then yeah, I highly you to watch my talk. There are some nice advice. I can imagine there will be listeners who would want to watch that talk. What was the title of it?
Starting point is 00:41:15 Building an LLVM-based tool lessons learned. Okay, very good. I can also share a link. Yeah, any particular lessons you wanted to share with us on the podcast about building an LLVM tool like MOL? Well, just another bold statement, don't use LLVM passes. Okay. So whatever it means, yeah.
Starting point is 00:41:41 Well, any chance that MOL will end up in the status of a sanitizer built into our tools right out of the box? I don't think so. Okay. Yeah, honestly, there was, yeah, there were some changes which would be needed to the whole stack of the programs, I think. Right. Very good. Okay. Well, it's been great having you on the show today, Alex.
Starting point is 00:42:06 Thanks for coming on. Yeah. That was a great pleasure to chat. Thank you very much. I'm nervous. Why? Well, I have some stuff
Starting point is 00:42:16 I have to try now. So thank you for expanding my mind for what might be possible with our testing. Yeah. Okay. Thanks. Thanks so much for listening in as we chat about C++.
Starting point is 00:42:28 We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, we'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter. You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank all our patrons who help support the show through Patreon.
Starting point is 00:42:54 If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode was provided by podcastthemes.com.
