CppCast - Unit Testing

Starting point is 00:00:00 Thank you. Now, get ahead of bugs by using LLVM static analyzers and CPP Core Guidelines checks from the comfort of your IDE. Start today at clangpowertools.com. In this episode, we discuss ISO papers and a new build time analysis tool for Microsoft. Then we talk to Oleg Rabaev. Oleg talks to us about unit testing and the importance of making your code testable. Welcome to episode 263 of CppCast, the first podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today?

Starting point is 00:01:32 I'm all right, Rob. How are you doing? Doing good. I wanted to ask more about this announcement I saw on Twitter right after recording last week. Is that something you wanted to talk more about? Well, I mean, my announcement that I'm working on a book was preempted, so it's out there for pre-release at the moment. I'm still working on it, but let's talk about it more after I actually officially finish the book. Listeners who are curious can go and find it; it's out there. Well, can you at least say just how

Starting point is 00:02:05 far along it is so far? Oh, I would say probably about 85 to 90 percent done with a book on C++ best practices, and it's supposed to be just simple, actionable items, is my goal. So each section is intentionally very short and simple. Okay. But so as you're saying there, you hadn't planned on officially announcing yet, but it kind of got out there a little bit last week. Well, I hit the "make this book available" button, because I had talked about it on Twitter that I was going to do it. And then I said, let people register to be notified when it was available for presale. And then I made it available for presale. And then people started talking about that on Twitter.

Starting point is 00:02:48 But I was still just going to try to keep it quiet for a little bit longer until the book was done. I apologize then. That's fine. We'll talk more about it when it's finished then. Okay. Right. All right. Well, at the top of the episode, I'd like to read a piece of feedback. This was from a few weeks ago when we had Slobodan on talking about his book,

Starting point is 00:03:11 Modern C++ for Absolute Beginners. And this commenter on Reddit says, Stop calling it modern. It's been modern for almost a decade now. I get that C++ is a slow-moving glacier, but the qualifier adds no useful information and just makes results more difficult to search for in the future. Idioms change, and post-C++26 I really don't want to continue searching for modern C++

Starting point is 00:03:32 only to find pre-C++20 material. I think it's probably been more than a decade. The modern C++, that term came about way before C++11, right? It did, but the way we all use it right now, I think, was because of Scott Meyers' modern C++ book, Effective Modern C++, which was 2012 that it was actually released, I think. I have to double check that.

Starting point is 00:04:03 So yeah, I'm not really sure if you're ever going to be able to get people to stop using that term. I don't know. What do you think, Jason? We could call it modern C++20. That's true. No, I don't know because there's a problem because you can be using C++20, but still be using C++98 techniques when there's better idioms and better options available. So how do you qualify that you're using the latest and greatest best practices and idioms that are available? I was just running through some of these things in my book about concepts and how much they can make the code better. But most people

Starting point is 00:04:47 don't have access to concepts yet. So I don't know. How do we call that when it's time to say, yes, modern C++ now includes concepts? I don't know. I don't know either. Okay. Well, we'd love to hear your thoughts about the show. You can always reach out to us on Facebook, Twitter, or email us at feedback@cppcast.com. Don't forget to leave us a review on iTunes or subscribe on YouTube. Joining us today is Oleg Rabaev. Oleg is a C++ developer who cares deeply about proper software design, clean code, and testing. He is experienced in building large-scale financial trading and risk management systems, in-memory databases, and reusable libraries. Oleg, welcome to the show.

Starting point is 00:05:28 Thank you. How are you doing, guys? Good. I think we have to congratulate you on possibly the shortest bio that Rob has ever had to read. All right, then. I appreciate the to-the-point terseness here about the things that you really care about as a developer. I like things simple and short. This was not always the case, but as you evolve through your career, you learn to do things simple. Simplicity is good.

Starting point is 00:05:53 The older you get, the more you ask the question of, how little code can I write to solve this problem? Absolutely. There you go. Oh, my God. Okay. Well, Oleg, we've got a couple news articles to discuss. Feel free to comment on any of these, and we'll start talking more about what you're up to, okay? Okay. So this first one is the 2020-08 ISO mailing is now available.

Starting point is 00:06:18 It's a fairly short one, but I guess they're putting this out every month, so there's only so much that makes it per month. Any highlights for you here, Jason? Um, not particularly, but more questions than highlights from my perspective. These mailings are coming out regularly, right? And the committees are still meeting remotely. Yeah. But I don't think I've heard of a single paper that was accepted for C++23 yet. Have they been accepted? Is there some list of these things? How do we deal with the fact that they are not having the meetings where people vote on things and it officially moves forward? I think, correct me if I'm wrong, but I thought we talked to someone who said that they're basically, you know, voting on

Starting point is 00:07:02 these things virtually, you know, kind of giving a tentative thumbs up or thumbs down, with the plan being that, you know, these things that we are virtually approving now, we want to actually vote in person on when we're able to. Okay. Does that sound right? Maybe. It's possible you made it up, but it doesn't sound false. Oleg, do you remember? Do you know what's going on here? No, but I like the fact that they have a proposal for stacktrace and the transactional memory. That looks very interesting to me personally. Okay, so based on your bio, it does sound like transactional memory would be interesting to you.

Starting point is 00:07:41 Transactional memory is one of these things. It was the hot new topic in the early 90s, and then people stopped caring about it, and there's been proposals in and out of the C++ standard. What are your thoughts on transactional memory? I mean, if you have a transactional memory, they really simplify the multithreaded algorithms, right? Correct.

Starting point is 00:08:03 Multithreaded code is extremely difficult, right? Especially with modern processors, there are some very optimized algorithms like lock-free and wait-free, right? They're very difficult to write. And I think transactional memory would facilitate in that. So basically, the way I look at it is that you make a change to a bunch of memory, and

Starting point is 00:08:28 then someone else will only see the changes when you commit the transaction. Now, how it's going to look syntactically and semantically in code, I haven't actually looked at the paper, right? But this is something that's very interesting. All your changes are treated as atomic, right? So just like atomic variables in C++11 and on. So this is something very interesting. I would definitely like to play with it once there is something stable or experimental.

Starting point is 00:08:54 Do you know, can we think of this like database transactions, like begin transaction, do some stuff, and then commit? Absolutely. For example, if you're writing a concurrent binary search tree, concurrent red-black or AVL tree or concurrent hash table, it would be very helpful there, I think. There are some hash tables, for example, implemented using lock-free algorithms, and those are extremely difficult to debug and design. It's like a black magic on steroids, like quite literally. So I think it would really be helpful in that sense. And stack trace, you know, C Sharp, I think Java, they always had it.

Starting point is 00:09:39 It would be nice to have something like that in C++. Not from the core, but actually be able to, you know, something that allows you to look at your stack much easier than it is now. Yeah, it looks like that proposal is based on the Boost backtrace library. Right. I haven't looked. I haven't looked. I just scanned the headings and I like those two. I feel like this is one of these things where, you know, and I'm probably exaggerating here, that it's going to face problems because, you know, like the standard doesn't acknowledge that there actually is a stack or something like that, right? Like, because this is the kind of thing that tends to happen.

Starting point is 00:10:20 Yeah. But it's already on its, wait, where did it go? I just clicked on it. It is in LWG, so it made it past Library Evolution. Is that what you were going to say? Yeah, oh, and it's on its sixth revision. So it's definitely moving forward. So I guess I should be more optimistic than that. Right, right.

Starting point is 00:10:40 Okay, next thing we have is an article from the Visual C++ blog, and this is introducing vcperf /timetrace for C++ build time analysis. And we had Kevin Cadieux, uh, and scion a while ago talking about this feature, and this is actually a guest post from a game developer from MercurySteam Entertainment. I don't think I'm familiar with that game development company. But he used the C++ Build Insights SDK that we talked about with Kevin and actually wound up submitting a new feature to it. And it's to output this graph that you can actually view in Chrome, which is pretty cool. Yeah, that's effectively exactly how Clang's time

Starting point is 00:11:24 trace also works. Okay. And, uh, I thought it was interesting how they found some, you know, issues in their own code using this. One of the things they mentioned is a particular default constructor, which was defined by the compiler, that was taking three percent of their total build time. So they just went and put in that constructor themselves and saved 3% on their build. It's amazing. Wow. Yeah.

Starting point is 00:11:53 But sadly, it's only available for Visual Studio, right? Yes, this is for Visual Studio. Yeah. The good old days. Well, Clang does have similar functionality as well. I forget what its parameter's called now, slash analyze perhaps. But yeah, something else for our listeners to look into if they're interested. Okay, and then Jason, did you want to share this last one? This is a post on the cpp subreddit about... Yeah, destructors. I just thought that's interesting, yeah, because the poster here

Starting point is 00:12:26 asked, like, how come GCC and Clang are giving, you know, one of them's allowing this code to compile and the other one is not. And it was using a technique that I recently had on C++ Weekly for having multiple destructors in your class with C++20's concepts, because you can constrain your destructors. So the destructor only exists if it matches the constraints. And this person wanted it to be virtual or not virtual. And I just thought it was fun. This is like, you know, just because it's so easy to ask these questions today. If you manage to formulate your question in a way that people can like read and provide

Starting point is 00:13:04 a Godbolt link to it and whatever. And the top answer on Reddit is a quote from the standard: a virtual function shall not have a trailing requires clause. Like, it's just right there, and they got the perfect answer just by asking the right question, well formed. And it was timely with C++20. So I liked the article.

Starting point is 00:13:23 It's not even an article. It's like a three-minute read. That's all. Okay. Okay, so Oleg, so you actually reached out to us and asked about coming on the show after we recently talked about one of those JetBrains state-of-the-community posts. In there, they talked about what unit

Starting point is 00:13:45 testing libraries C++ developers are using. Over a third of respondents to that poll said they don't write unit tests for C++. What are your thoughts on that? Complete and utter shock. Did you

Starting point is 00:14:01 cry after this? I think if George Costanza was a C++ programmer, he would say the worlds are colliding at this point. I mean, I was completely shocked. Yeah, I mean, this is 2020. I can understand 10 years ago, right? I think unit testing itself picked up at the beginning of the agile movement. I wrote my first unit test probably in 2008 without even realizing that I was writing a unit test, because I just wanted to write a small program that sort of tests a small piece

Starting point is 00:14:38 of code. And I had like main function where I would have like a switch statement, comment it out, but I had no idea it was a unit test, but actually it was a unit test. And then I was introduced to the Google test framework, right? And for me personally, it was a paradigm shift. And since then, like since about 2010, I've been hooked. So a unit test in general improves the quality of your code. I think someone mentioned if you write a clean code, you don't have to write a unit test, right?

Starting point is 00:15:12 In one of the comments you mentioned on the show. And I'm like, how can you write clean code without a unit test? So, I'll tell you a real story, right? Go ahead. So, I was working on a project. It's actually an in-memory database, right? So, I'll describe it in very generic terms. Basically, it's multi-threaded.

Starting point is 00:15:36 It has a SQL compiler, basic data, composite indices, has alter table operations on a running database. Replication, you know, the cluster should never go down. It should be 24/7 operational, stuff like that. Processes tens of thousands of messages a second. Has a lot of lock-free data structures. So it's an extremely complex system, right? So we hired a developer into the team who was supposed to help out, sort of support it and extend the functionality.

Starting point is 00:16:10 And in case, you know, just something happens to me personally, I would say, right, someone can take over. Like Jeff Bezos decides to adopt me, right? Is that your retirement plan? No. I'll never retire. I'll always say to my family, I will die coding. So he joins the team, and I expect questions, right?

Starting point is 00:16:42 I'm naturally very friendly. He doesn't ask me a question on day one. He doesn't ask me a question on day two, three. A week goes by, two weeks go by. I am starting to think the guy doesn't like me. Then I turn to him. It's like, why don't you ask me a single question?

Starting point is 00:16:57 It's a pretty complex system. He says, Oleg, I don't need to. Every time I want to find out how a particular component works, I just look at the test, and it's there. I don't need to ask you a single question. That particular system had several thousand unit, interaction, and integration tests. So that's sort of, this is my philosophy personally, and I try to, you know, preach it. I, you know, I was not always like that.

Starting point is 00:17:22 I wrote a lot of spaghetti code at the beginning of my career, just like everyone else, let's be honest. I assume I sometimes do the same thing, right? And it's not up to me to judge, but I always try to improve my design skills. And I feel that there has to be some kind of methodology that allows you to achieve that easily. For example, we all agree it's much easier to program in C++ than in assembler, right? Usually, yes. I'm not sure about C++26, right? So, like, for example, we have static typing in C++, right? If you have an integer and you're trying to assign a string, a std::string, to an integer variable, the compiler will not let you do it, right?

Starting point is 00:18:05 You can't just move memory like it will prevent you from doing that. Things like that. So it's built into the language where the compiler sort of is guiding you and not allowing you to do certain things, which would be stupid, right? Now, with testing, I treat testing as my guide to a design. Whenever I work on any piece of code, I always think, how am I going to test it? And I don't always think about a particular library or service. I always look at the system as a whole.
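The static typing point above can be made concrete with a small sketch. It only uses standard type traits; the `parse_or_zero` helper is a hypothetical name added for illustration, showing that going from a string to an integer has to be spelled out explicitly.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <type_traits>

// The compiler refuses to assign a std::string to an int variable.
static_assert(!std::is_assignable<int&, std::string>::value,
              "std::string must not be assignable to int");

// Assigning an int to an int is, of course, allowed.
static_assert(std::is_assignable<int&, int>::value,
              "int is assignable to int");

// Hypothetical helper: an explicit, checked conversion from text to int.
// Returns 0 when the text is not a number or does not fit in an int.
int parse_or_zero(const std::string& text) {
    try {
        return std::stoi(text);
    } catch (const std::invalid_argument&) {
        return 0;
    } catch (const std::out_of_range&) {
        return 0;
    }
}
```

Writing `int n = std::string("42");` directly would be rejected at compile time, which is the kind of guidance the compiler gives for free.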

Starting point is 00:18:38 How am I going to test it? So there are three simple rules that I try to follow. Number one, write small functions, right? There is a lot of fancy jargon out there, but if you want to, let's say, promote some sort of developer philosophy, you want to make it as simple as possible, right? Just like the bio. So, guys, we should all write small functions. The code should be modular. And I think several guests, I remember just recently,

Starting point is 00:19:13 several months ago, there was a guest who was actually pretty funny, and he mentioned small functions. This is just basic stuff, right? The next thing would be no cycles. So, when I joined Bloomberg, you all know, I'm sure the C++ community knows John Lakos, right? Right. He's been on the show, yeah.

Starting point is 00:19:32 Yes, yes. And I was introduced to his, he has a lot of YouTube videos, right? And I watched his stuff, you know, and some of his ideas are revolutionary, right? But one thing that I take away for myself is the hierarchical design and no cycles under any circumstances, with some exceptions. I'll elaborate on that. What does it mean? Like if you have component A and component B depends on component A, there should not be cyclical dependencies. Oh, okay. Or if you have anything like component C depends on component B. Component A should never depend on component C, right?

Starting point is 00:20:07 Because B depends on A. You should not have cyclical dependencies in functions, in classes, in libraries, in services. You should not be, in my opinion, there should not be cycles. And I had code prior to that with cycles like, and once I started following that philosophy, it actually improved the design and testability of the code. So actually, John mentioned, they asked him, I think, in one of the talks that could be on your show, or maybe it's from some of the YouTube videos,

Starting point is 00:20:36 what do you think is the biggest benefit of modules? So I think John said, John Lakos said, for me, it's almost impossible to create a cyclical dependency with modules, right? Yeah, that might have been his interview on here. It could be, yeah. So no cyclical dependencies. And it's very easy to achieve. Just follow those rules, right? So small functions, when you say small functions, then everything becomes small. You have small classes, smaller libraries, smaller services.

Starting point is 00:21:06 You just break your components into smaller pieces and make them reusable. No cycles. And number three, tests. These are the three rules that I personally follow. Make your code testable and designed for testing, right? So if you do that, and even if you don't read any other books with 1000 definitions right which a person can get very discouraged from by just looking at stuff and feel like there's there is this kind of test there is a approval test right regression unit test there is a

Starting point is 00:21:39 behavior-driven development there is a this this and like, where do I start? Like, I have to know all this in order to be productive. So, I think you should always start with a simple principle and it is like 80-20 rules. By following these principles, you will cover 80% of your design, in my humble opinion. So,

Starting point is 00:22:00 I'll just give you an example. Your tests become your documentation. You express your code through the test. In fact, if it was up to me, let's say, you know, you guys use Compiler Explorer or Wandbox, right? I like Wandbox, right? But there is always a main function there, right? To me, it shouldn't be a main function. It should be a test fixture.

Starting point is 00:22:24 And instead of doing the cout and looking at the end, I would rather use the assert macros or require macros, right? This is my philosophy. This is like if I started on something new and want to test something, I don't write a main function. I will have a test fixture, and that's where I will basically play with my code and stuff like that, right? Using the assert macros or expect or require,
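A minimal sketch of the "assert instead of cout" idea described above. To stay self-contained it uses plain `assert` from the standard library rather than a real framework's macros (Google Test's `EXPECT_*`/`ASSERT_*` or Catch2's `REQUIRE` play the same role there); the `split` function and the test names are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical function under test: splits text on a single-character delimiter.
std::vector<std::string> split(const std::string& text, char delimiter) {
    std::vector<std::string> parts;
    std::string current;
    for (char c : text) {
        if (c == delimiter) {
            parts.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    parts.push_back(current);  // the last piece (possibly empty)
    return parts;
}

// Instead of printing the result and reading it by eye, state the expectation.
void test_split_on_comma() {
    const std::vector<std::string> parts = split("a,b,c", ',');
    assert(parts.size() == 3);
    assert(parts[0] == "a");
    assert(parts[1] == "b");
    assert(parts[2] == "c");
}

// Corner case: no delimiter present means the input comes back as one piece.
void test_split_without_delimiter_returns_whole_input() {
    const std::vector<std::string> parts = split("abc", ',');
    assert(parts.size() == 1);
    assert(parts[0] == "abc");
}
```

Calling the two test functions from `main` (or registering them with a test framework) turns the experiment into something that fails loudly when behavior changes, and the test names double as documentation.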

Starting point is 00:22:46 depending on the framework you use, instead of doing the couts and stuff like that. Okay, so you mentioned like a ton of terms just now. Yes. So it kind of sounds like you do what most people would call test-driven development. Is that fair? Yes, but not religiously.

Starting point is 00:23:06 I will explain why. I don't write my test before I write my code. So that is typically what test-driven development would be, right? Well, again, it's one of those arguments between a couple of camps, it's like, what's better, vi or Emacs, right?

Starting point is 00:23:21 And some people do it this way, some don't. I don't, personally, because it hinders my design. I mean, it's just more difficult for me like that because then you start losing... I don't want to criticize that. So I don't want to like... But basically, I don't do it.

Starting point is 00:23:36 But I will write a test for every public function of my class. Okay, we're C++ programmers. Let's talk about public functions, right? That's the only thing you can really test. I will write a test, at least one test, for every public function. Okay. There will be more tests because you want to test the function in, you know,

Starting point is 00:23:57 fulfill the contract of the function, right, and then test the exceptions and the corner cases, stuff like that. I don't test undefined behavior. That I would not test. Okay. So, it does sound like, you know, when you see that people do things like #define private public to be able to make their class testable.

Starting point is 00:24:22 No, no, no. It's a no-no for me. I just don't use that personally. Well, if you do stuff like that, then maybe your class is too monolithic and you should break it up into smaller classes. If you have too many private

Starting point is 00:24:36 functions that in themselves provide some testable functionality, maybe you should make them part of another component and use that component through composition inside the class, right, as a client. Right. But I would never do this. I don't

Starting point is 00:24:52 like to pollute my code with things like that, no. Yeah, so for the sake of our listeners, that is possible technically with most compilers, that you can do that and then all of the private functions in a class become public. And yeah, but it's not a good idea usually. I don't advise it, yeah.
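The alternative suggested above, moving testable private helpers into their own component and using it through composition, might look like the following sketch. `PriceFormatter` and `OrderReport` are hypothetical names invented for this example; the point is only that the extracted class has a public interface that can be tested directly, with no macro tricks.

```cpp
#include <cassert>
#include <string>

// Formerly a private helper of OrderReport (hypothetical). As its own small
// component it has a public function that can be tested in isolation.
class PriceFormatter {
public:
    // Formats an amount in cents as "dollars.cents", e.g. 1234 -> "12.34".
    std::string format(unsigned long cents) const {
        const unsigned long dollars = cents / 100;
        const unsigned long remainder = cents % 100;
        return std::to_string(dollars) + "." +
               (remainder < 10 ? "0" : "") + std::to_string(remainder);
    }
};

// The higher-level class is now a client of PriceFormatter via composition.
class OrderReport {
public:
    std::string line(const std::string& item, unsigned long cents) const {
        return item + ": " + formatter_.format(cents);
    }

private:
    PriceFormatter formatter_;
};
```

Both classes stay small, and the formatting rules get their own unit tests without exposing any private members of `OrderReport`.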

Starting point is 00:25:10 I tried it once like a while back, but no. You tried it. So you said you don't want to get too caught up in terminology, but I do kind of wanted to dig in a little bit. It does sound like you do advocate specifically for what most people would call unit testing, individual functional testing. I advocate way more. So if you want, I will elaborate on a different type of test that I personally write.

Starting point is 00:25:36 Okay. So I think everything is a unit or integration test. Everything else is just a derivative of it, right? If you really think about it, and I'll explain what the unit test is, in my humble opinion, and the integration test. And there is another thing. Maybe it's a concept that I heard in one of the lectures, I believe, somewhere on YouTube. And I like that concept, so I'll talk about it: it's the interaction test. So what is the unit test, right?

Starting point is 00:26:05 Yeah. I've never heard that term before. So now is the chance. What is the unit test, right? So the unit test is basically, it's just a small program with minimal setup code, right? A small program with minimal setup code, which exercises some narrow part of your public function of a class. That's it. Now, what is the interaction test?

Starting point is 00:26:28 Interaction test is a unit test with more complex setup. Now, what does that mean? As you build your component, let's say, for example, if you're testing vector or a string class, right, you will only write unit tests because vector in itself is more like a vocabulary. It doesn't interact with any other components. But as you build your class hierarchy, right, the testing becomes a little bit more complex, correct?

Starting point is 00:26:53 Because you have to do more things just to set up the code in order to test your higher level components. So those kind of tests I call interaction tests. And then there are integration tests. Integration test is either unit test or interaction test with external dependency. That's all it is. Meaning, if you are, if let's say you write database access, and you have function select by parameter key, you have an external dependency on the database, right?

Starting point is 00:27:26 Let's say it's a SQL Server database. So in order to test your code, you have to have an instance of the database running. So you have the dependency, and that makes it a little bit harder to test, right? Integration testing is harder because what if it's a more complex dependency, like you use some kind of message queue mechanism which requires setup, or you talk to your service over TCP/IP connections, right? That's the integration test. Now, the community tends to merge the interaction test and the integration test into one concept.

Starting point is 00:28:00 I split it just so to make interaction test just like integration tests, but without dependencies. And my rule is I always try to achieve unit and interaction tests and remove external dependencies. And there are ways of doing that, but you have to design your system to start with to do that. And I can elaborate on that. For example, in the story that I told you at the beginning about this large-scale in-memory database, one of the things I was able to test without any kind of dependency is communication protocol, replication.

Starting point is 00:28:44 If one goes up, another one goes down, you know, things starting, you know, exchanging information and everything continues working. How do I do that without having any kind of external dependencies on file or TCP or anything like that? Great question, yeah. The way you do it is the following way. There is a technique called, you know,

Starting point is 00:29:04 it's a widely used technique in Java, I think: dependency injection, right? So what you have, let's say you have a component A that talks to component C using component B, and component B is a TCP/IP interface, right? Let's say we're writing a ping server, a ping client. So the ping client code just says, you know, I'm sending something using B to C, right? C is the server, A is the client, B is the middleware. So how do you test that? Okay, you have to start your server, right?

Starting point is 00:29:46 That will listen and bind in the socket, and then the client will open the socket connection, et cetera. Do you do it like that? No. What you do is, since you are communicating between the client and the server using the middleware, implement that middleware as an abstract interface, which has a function send, for example, right? And that abstract interface can be plugged into the file, injected into the client code, and provide in-memory implementation.

Starting point is 00:30:14 Instead of sending data over physical network, send the data over memory, just dump it into the vector of bytes, right? And do the same thing for the server side. So during your setup code, you basically replace that interface, the actual network layer with your own implementation, right? Which is in-memory implementation. And you can test your system end-to-end without external dependencies.
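The dependency injection approach just described can be sketched roughly as follows. This is an illustration under assumptions, not the design of the system discussed in the episode: `Transport`, `InMemoryTransport`, `PingClient`, `PingServer`, and the "PING"/"PONG" messages are all hypothetical names. The client and server only know the abstract interface, so a test can inject an in-memory implementation that appends to a vector of bytes instead of opening sockets.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Illustrative helpers to move between text and raw bytes.
Bytes to_bytes(const std::string& text) { return Bytes(text.begin(), text.end()); }
std::string to_text(const Bytes& bytes) { return std::string(bytes.begin(), bytes.end()); }

// The "middleware": the only thing client and server know about the network.
class Transport {
public:
    virtual ~Transport() = default;
    virtual void send(const Bytes& bytes) = 0;
};

// Test implementation: records the bytes in memory instead of using a socket.
class InMemoryTransport : public Transport {
public:
    void send(const Bytes& bytes) override {
        sent_.insert(sent_.end(), bytes.begin(), bytes.end());
    }
    const Bytes& sent() const { return sent_; }

private:
    Bytes sent_;
};

// The client gets its transport injected; it never creates a socket itself.
class PingClient {
public:
    explicit PingClient(Transport& transport) : transport_(transport) {}
    void ping() { transport_.send(to_bytes("PING")); }

private:
    Transport& transport_;
};

// The server replies through an injected transport as well.
class PingServer {
public:
    explicit PingServer(Transport& reply_transport) : reply_(reply_transport) {}
    void on_receive(const Bytes& request) {
        if (to_text(request) == "PING") {
            reply_.send(to_bytes("PONG"));
        }
    }

private:
    Transport& reply_;
};

// End-to-end check with no sockets: the setup code wires both sides in memory.
bool ping_round_trip_in_memory() {
    InMemoryTransport client_to_server;
    InMemoryTransport server_to_client;
    PingClient client(client_to_server);
    PingServer server(server_to_client);

    client.ping();
    server.on_receive(client_to_server.sent());  // "deliver" the request
    return to_text(server_to_client.sent()) == "PONG";
}
```

In production, a socket-backed class implementing the same `Transport` interface would be injected instead; the client and server code would not change.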

Starting point is 00:30:40 I'll give you another example. For example, let's talk about, let's say there is a MySQL or SQL Server database, right? You cannot test it unless you start up the service, right? Right. But you could do this with SQLite, correct? I think SQLite is something that you can bring up in the process, right? You don't need the actual service instance running, right? So if the SQL Server developers or MySQL developers thought ahead a little bit and thought,

Starting point is 00:31:10 like, how are we going to make our system much easier to test that would provide ability to start some kind of light database engine on the client without any kind of external dependencies that works completely in memory, right, with a simple interface, and you can test your code this way. In fact, recently I looked into one of the graph databases. I think it's Neo4j, and they actually did that. They said, okay, guys, if you want to unit test,

Starting point is 00:31:37 you don't have to run the database engine, there is an in-memory implementation. So when I write my code, you don't always have this chance because sometimes you depend on other systems, right? Or on other open source projects, which simply don't. But if you're working on an infrastructure component, you should always design it for clients so they can test it easily without external dependencies. You will start thinking differently, right? You'll say, oh, you know what? I have, let's say, an optimization engine, which I want to separate so that I could run my stuff

Starting point is 00:32:11 in a lighter mode. Oh, it actually leads to much better design. So this is, you do want to turn your integration tests into unit tests. That's the goal. It's not always possible, but that's

Starting point is 00:32:25 what we want to do. I want to interrupt the discussion for just a moment to bring you a word from our sponsors. Clang Power Tools is the open source Visual Studio extension on a mission to bring LLVM tools like Clang++, ClangTidy, and ClangFormat to C++ developers on Windows. Increase your productivity and modernize your C++ code with automatic code transformations powered by Clang Tidy. Find subtle latent bugs by using LLVM Static Analyzer and CPP Core Guidelines checks. Don't fiddle with command line switches or learn hundreds of compiler flags. Clang Power Tools comes with a powerful user interface to help you manage the complexity directly from the comfort of your IDE. Never miss a potential issue by integrating Clang Power Tools into your

Starting point is 00:33:03 workflows with a configurable PowerShell script for CI/CD automation. Start experimenting on your code today. Check it out at clangpowertools.com. So you talked about these three types of testing that you focus on. Do you assign weight or importance differently on them? Like how many unit tests do you have compared to the number of integration or interaction tests? That's actually a great question. Unit tests are usually applicable to lower level components, right?

Starting point is 00:33:31 And you do want to test them in and out, like you want to do it to extreme. You want to have complete coverage. Now, as you go higher level, it becomes almost impossible. Because the amount of interactions, right, the amount of ways you can, you have so many components in the pipeline, right? The amount of, it becomes extremely difficult, right?

Starting point is 00:33:53 So your number of tests will go down, but you do want to have tests that will sort of, that will sanitize your code, so to speak, right? I mean, in this case, like approval testing, right? You can generate a bunch of, let's say, files, the input files, which become a golden copy, right? And then whenever you change your code, right, to make sure there are no bugs, you can run through the same regression, right? And hopefully nothing is broken. Everything will pass, right?
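
The golden-copy idea described here can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `render_report` and `matches_golden` are hypothetical names, not anything from the guest's code, and the approved output is kept as an in-memory string where a real setup would more likely read it from a checked-in file.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical code under test: formats a list of amounts as a small report.
std::string render_report(const std::vector<int>& amounts) {
    std::string out;
    int total = 0;
    for (int amount : amounts) {
        out += "item " + std::to_string(amount) + "\n";
        total += amount;
    }
    out += "total " + std::to_string(total) + "\n";
    return out;
}

// Approval check: the current output must be identical to the previously
// approved ("golden") copy. In practice the golden copy would usually live
// in a file next to the test; a string keeps this sketch self-contained.
bool matches_golden(const std::string& actual, const std::string& golden) {
    return actual == golden;
}

// Regression test: rerun the same input after every code change.
void test_report_matches_golden_copy() {
    const std::string golden = "item 2\nitem 3\ntotal 5\n";
    assert(matches_golden(render_report({2, 3}), golden));
}
```

Calling `test_report_matches_golden_copy()` after each change plays the role of the regression run: any unintended change in the output makes the comparison fail.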

Starting point is 00:34:26 But it becomes very difficult. It becomes very difficult to test as you go higher level. So more tests on the unit side, there will naturally be fewer tests on the interaction and integration side, but they have to be there. So I think the 80-20 rule works here. So 80% of tests will be on the unit side and 20% on the integration side.

Starting point is 00:34:49 Okay. So I'm thinking of a couple of different things here. I've known some people that say that they don't really do unit tests. They only do behavior tests because they want to be able to swap out the implementation as long as they still get the same behavior from the system. Now, I was not thinking about that when I was working on my scripting engine, ChaiScript, years ago, which, just for the record, I haven't maintained in a long time now. But, coincidentally, we have almost no tests in there that test the actual C++, like, "I parsed this and I expect this AST back." Instead, we just wrote things that only test the scripting engine.

Starting point is 00:35:27 Like if I put the 5 in, do I get an integer 5 back out? If I do 5 plus 4, do I get 9 back out of the system? Which gave us very good code coverage ultimately and let us know if any of the behavior broke in the system. But I'm curious about your opinion. Would you say that I did a bad job testing the system? I think you did more interaction tests than unit tests. All right? You do want to test every component. And I think behavior-driven testing is applicable to unit tests or interaction tests. It's just, it's sort of like syntactic sugar, right,

Starting point is 00:36:02 depending on the framework we use, with, like, given a certain input, when, you know, you do something. I think Catch has support for that, right? GTest does not. But I think what you did, you did more interaction tests. And I see it all over the place. You did a good job because you do have coverage, but I don't think you have enough unit tests because you do want to test your parser.

Starting point is 00:36:28 You do want to test your lexer, right? You do want to test every low-level component. Right. For example, if you write a string class, are you going to test your string class from your client code by using it, or are you going to write unit tests for the string class? You should probably write the unit tests for the string class, right? That's a good point, yeah.
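
The difference between testing a low-level component directly and testing it only through the client interface can be sketched as follows. Everything here is hypothetical and far simpler than a real scripting engine: `tokenize` stands in for a lexer, `evaluate` for the engine's public entry point, and only sums of non-negative integers are supported.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical low-level component: splits "5 + 4" into "5", "+", "4".
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;  // skip whitespace
        } else if (std::isdigit(c)) {
            const std::size_t start = i;
            while (i < src.size() &&
                   std::isdigit(static_cast<unsigned char>(src[i]))) {
                ++i;
            }
            tokens.push_back(src.substr(start, i - start));
        } else {
            tokens.push_back(std::string(1, src[i]));  // single-char operator
            ++i;
        }
    }
    return tokens;
}

// Hypothetical high-level entry point: evaluates sums such as "5 + 4".
int evaluate(const std::string& src) {
    int result = 0;
    for (const std::string& token : tokenize(src)) {
        if (token != "+") {
            result += std::stoi(token);
        }
    }
    return result;
}

// Unit test: exercises the lexer directly, including an edge case.
void test_tokenize() {
    const std::vector<std::string> expected{"5", "+", "4"};
    assert(tokenize("5 + 4") == expected);
    assert(tokenize("").empty());
}

// Interaction test: exercises the whole pipeline through the client interface.
void test_evaluate() {
    assert(evaluate("5") == 5);
    assert(evaluate("5 + 4") == 9);
}
```

If `test_evaluate` fails, the bug could be anywhere in the pipeline; `test_tokenize` narrows it down to one component, which is the argument being made for having both kinds of test.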

Starting point is 00:36:48 Yeah, so I think you have more interaction tests. I don't think you have any external dependencies there, right, because it's a scripting language. Yeah, I know. So, but possibly you don't have enough unit tests. Yeah. And I think one of the most important tests is the unit test, because once you test your low-level components, if you have a good coverage, you just rely on them working. You never test your low-level components in terms of high-level components.

Starting point is 00:37:15 You just rely on them working. That's it. And obviously, you need the end-to-end tests, like you provide an input 5 and it returns something else, right? Yeah. Those are also irreplaceable. Sometimes people will go wild with unit tests and they will shun

Starting point is 00:37:30 the integration tests. There is another extreme like that. They will mock everything, right? I think mocking is useful to some extent to break the dependency, right? If you don't control the infrastructure, let's say, and you're talking to a particular

Starting point is 00:37:45 service by giving some, I don't know, JSON input, you get some JSON output, right? So you don't want to start up that service and the whole infrastructure. So you mock it out. Usually what you do, you have an abstract class interface with some methods, you know, like get some data: you request some data and it returns you some kind of response. So you mock it. In the real implementation, it will go into that service and will execute code there, right? But in your mocked implementation, you're just going to basically, whenever you call get some data, you will return hard-coded

Starting point is 00:38:19 values, or by whatever other means you do that. So that you break the dependency and you can unit test your components that interact with other components over an external dependency, right? But some people take it to the extreme. They will mock everything, right? And I don't think it's useful because you could do that but you will have to write a tremendous amount of what I think could be unnecessary tests because at the end

Starting point is 00:38:55 of the day you do want to do integration testing, right? And integration tests usually will catch a lot of errors that were missed during the mocking. Because then you want to test the whole system. Going back to my database example, right? I had only one, but I did not really have mocking. I mean, I did have mocking there, but it was very limited. The way I achieved that, you can also say it's mocking. I replaced all the communication with the abstract interface so that I can test everything

Starting point is 00:39:25 in memory, right? But I would not mock every component interaction between any other component. No. I would just use the other components as they are because they execute... If you have a string and a vector and you build some, let's say,

Starting point is 00:39:41 I don't know, like some other class which includes the string and the vector, you don't mock that class. And there is another class which includes that class. Like, no, you just use those classes like you did with your scripting, right? At the end of the day, you don't mock. You have your parser, you have your lexer, right? You have something else going there

Starting point is 00:39:59 and you have a client interface. You provide an input, you get an output. I'm a fan of that. You should really test each component with unit tests, fine. But at least you have proper interaction tests, right? So, in fact, you know, once I gave the presentation, and, you know, I decided to make it a little controversial, and I was trying to prove the point that when you have external dependency

Starting point is 00:40:23 and if you control the infrastructure, it's much better to, instead of mocking, just provide the in-memory implementations and a lot of times just test the system as a whole. So my second slide would say, like, mocking is evil, right? Which obviously was not the case. I was just trying to, you know, just wake up people, right? Now they're paying attention, right? So I think you should know when to mock and when not to mock. But sometimes you have to do it. Mocking is especially good when you do test error conditions.

Starting point is 00:40:56 For example, if you are testing your database, even if you have your in-memory implementation, right, how do you generate connection errors, right? How do you generate that something bad happened? It's difficult, right? Because it's difficult. So then you mock. That's a perfect case for mocking. Yeah.
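
Mocking for the error case can be sketched like this. The names are assumptions made up for the illustration (`Database`, `FailingDatabase`, `load_user_name`); the point is only that a hand-written mock behind an abstract interface can produce a connection error on demand, which valid input to a real or in-memory database would not.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical abstract interface the client code depends on.
class Database {
public:
    virtual ~Database() = default;
    virtual std::string query(const std::string& sql) = 0;
};

// Hand-written mock: every query fails as if the connection had dropped.
class FailingDatabase : public Database {
public:
    std::string query(const std::string&) override {
        throw std::runtime_error("connection lost");
    }
};

// Hypothetical code under test: falls back to a placeholder on failure.
std::string load_user_name(Database& db, int id) {
    try {
        return db.query("SELECT name FROM users WHERE id = " +
                        std::to_string(id));
    } catch (const std::runtime_error&) {
        return "<unavailable>";
    }
}

// The error-handling path runs deterministically on every test run.
void test_load_user_name_on_connection_error() {
    FailingDatabase db;
    assert(load_user_name(db, 42) == "<unavailable>");
}
```

Only the dependency that produces the error is mocked here; the code under test is the real thing.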

Starting point is 00:41:17 Another thing I wanted to mention. I would like to say I really like that point, that mocking is particularly useful for the error case. I find that very interesting. Yes, yes. For error case, it's the perfect case for mocking, right? Right. Only way to consistently guarantee you're going to get an error.

Starting point is 00:41:33 Yeah. Exactly, exactly. But I would be against mocking every component in the system. That would just be insanity. And I've seen people do that. Now, given what we just discussed, right, like I think you could see that by following these practices, right, just write modular code, don't create cycles, and test your code, it will not let you make bad design decisions. It's like you become your own compiler, so to speak, right? If you're just disciplined enough that you follow this very easy, like,

Starting point is 00:42:06 writing small functions is a very easy principle to follow. And it's, there are beyond, like, there are books, you know, there is the single responsibility principle, you have to have the smallest amount of parameters, this, that, and like, when you try to explain something to someone,

Starting point is 00:42:21 you want to make it simple. The more simple you make, the easier it will be accepted. So, I think writing small functions is a very easy concept. Creating no cycles is also an easy concept. So are tests. Now, I have my other three sort of

Starting point is 00:42:37 statements, which I want to run by you, which also guide my software design. If a component is hard to test, it's not properly designed. Okay. Number one. If a component is easy to test, it is a good indication that it's proper. It doesn't mean you can't still have a mess inside the code implementation,

Starting point is 00:42:58 but it's a good indication that it's properly designed. Now, if a component is properly designed, it's easy to test. So whenever I write a class, I'm asking myself these three questions. In general, I ask myself, is it easy to test? If it's easy to test, it's a good indication that it's properly designed. And that's what design for testing means for me. Now, I appreciate those guidelines sound good, but how do you – it sounds like a lot of what you're talking about, you're like, well, unit test, a lot, but not too much, mock some, but not all the things. How do we start to develop this mindset, this intuition for what is good testable code?

Starting point is 00:43:46 Like, how do we, I mean, we have to practice, it sounds like. Right, right, right. The only way, it's a paradigm, it's a paradigm change. For me, it was like, I find it difficult sometimes to bring it over the finish line, right? But once you get it, you cannot do things otherwise. So unit tests, a lot. There is no end to the unit tests that you can write. Now, interaction and integration tests, a reasonable amount, but

Starting point is 00:44:17 you are limited because literally you could have, like, a factorial explosion there, right, because there are so many ways you can use the components. Mocking only when you need to test for error conditions, or if there is a complex dependency, not external dependency, but even like within your project, sometimes you would do mocking just to separate another part of the system so you can test it easily. But I wouldn't mock everything. Absolutely not.

Starting point is 00:44:52 And integration tests, try to turn them into unit tests, interaction tests, by removing external dependencies. And the way to do it is to use dependency injection. Replace your communication protocols or protocols that talk to file systems or message queues, databases, with in-memory implementations. It is sort of like mocking, except that the actual code is executed. It's just whatever would be executed on the server side is actually executed in your own process, basically, right?
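
A minimal sketch of this in-process approach, assuming a made-up key-value service: `KeyValueServer`, `Transport`, `InMemoryTransport`, and `KeyValueClient` are all hypothetical names. The transport is injected into the client, and the in-memory version calls the real server logic directly instead of going over a network.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical server-side logic. The same class would run behind a network
// listener in production and in-process in tests.
class KeyValueServer {
public:
    // Understands "SET key value" and "GET key".
    std::string handle(const std::string& request) {
        std::istringstream in(request);
        std::string command, key, value;
        in >> command >> key;
        if (command == "SET" && (in >> value)) {
            store_[key] = value;
            return "OK";
        }
        if (command == "GET") {
            const auto it = store_.find(key);
            return it != store_.end() ? it->second : std::string("NOT_FOUND");
        }
        return "ERROR";
    }

private:
    std::map<std::string, std::string> store_;
};

// Abstract transport: the seam where the communication protocol is injected.
class Transport {
public:
    virtual ~Transport() = default;
    virtual std::string send(const std::string& request) = 0;
};

// In-memory transport: no sockets, but the real server code still executes.
class InMemoryTransport : public Transport {
public:
    explicit InMemoryTransport(KeyValueServer& server) : server_(server) {}
    std::string send(const std::string& request) override {
        return server_.handle(request);
    }

private:
    KeyValueServer& server_;
};

// Client under test: receives its transport through the constructor.
class KeyValueClient {
public:
    explicit KeyValueClient(Transport& transport) : transport_(transport) {}
    void set(const std::string& key, const std::string& value) {
        transport_.send("SET " + key + " " + value);
    }
    std::string get(const std::string& key) {
        return transport_.send("GET " + key);
    }

private:
    Transport& transport_;
};

// End-to-end behavior, exercised entirely inside the test process.
void test_client_round_trip_in_process() {
    KeyValueServer server;
    InMemoryTransport transport(server);
    KeyValueClient client(transport);
    client.set("lang", "cpp");
    assert(client.get("lang") == "cpp");
    assert(client.get("missing") == "NOT_FOUND");
}
```

A production build would inject a different `Transport` that talks to the server over the network; a test against that one would be the slower, "real" integration test.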

Starting point is 00:45:21 So you want to turn all your tests into in-process tests. Now, the way people achieve this these days, I mean, they don't really achieve that, but they use Docker for the integration test so they have the environment

Starting point is 00:45:31 because the setup is very difficult, right? So they have the environment, they make sure all the dependencies are properly set up, they start up the container, right,

Starting point is 00:45:40 and they run the integration tests like that, which is great, but it's not pure in the sense that you cannot – I tend to write my infrastructure code so that I can test it in process. Yeah, it almost sounds like you would think of that as cheating. I didn't want to say that, but it's a good cheating, right?

Starting point is 00:45:59 I would rather do that than not do integration tests because to set up the proper environment is difficult, right? Especially if you – I mean, it could be difficult, let's say it this way, right? It is – I don't want to call it cheating. I think it's a good approach. It's a good compromise. Well, I've had a couple of students say, oh, well, you know, I've got this class and I want to test it. I need to mock it. So I made all the member functions virtual so that I can derive from it and do the mocking kind of interface with it.

Starting point is 00:46:32 And I respond to them and I say, I am not an expert in testing methodologies, but that doesn't feel right to me. If that's the only reason you have virtual functions. Absolutely right, Jason. This is what I call mocks are evil. I mean, no, quote, unquote. Yes. I would not do that. No.

Starting point is 00:46:50 I would do that when I need to, when my class, let's say, returns an error condition, which is difficult to replicate by providing input. For example, if you are passing a valid SQL statement to the database engine, you're probably not going to get a connection error, right? How do you simulate a connection error? So in that sense, if you have like some client for the database and you want to simulate a connection error and see how your actual code will respond to it, mocking is great. Mocking is also good when you have an external dependency, you don't control it, and you don't want to bring the whole environment up, right? So you break your external dependency by mocking.

Starting point is 00:47:29 Okay, acceptable. And also, even if you don't have external dependencies, but when you have complex interactions, like really complex, which requires very complex setup, and between like two subsystems, so you can mock another subsystem. Just, okay, that's acceptable. But not every class, no. Okay. The other thing that kind of, you know, concerns me, for lack of a better word, when I think about mocking and dependency injection is I go, well, it feels like now I'm testing something different. Like, I'm not testing the actual system. I'm testing the mocked system. Yes. But your arguments for being able to consistently generate errors, and it all makes sense to me, but I'm just curious,

Starting point is 00:48:18 has that ever been a problem? Have you ever gotten down the road and then go, oh, no, my mock actually gave me a false sense of security somehow? Well, for errors, no. Okay. And for dependency injection, I just want to, like, maybe slightly correct you, I'm not sure if I understood it correctly. When you use dependency injection the way I described it, by replacing the actual protocol, say it's TCP/IP, with an in-memory implementation, you are actually testing your whole system.

Starting point is 00:48:47 That's the whole point. Because you are executing the code on the other side, which would be executed on the server side. You are doing end-to-end. That's the whole point of it. You've just removed the operating system and the internet out of it. You just removed the TCP communication. And now, should you test with the actual TCP/IP component, right? Like instead of doing dependency injection, should you actually test with the interface which implements it? Of course. That's going to be the real integration test.

Starting point is 00:49:16 They are irreplaceable. But they are a little slower to run, right? You have to have an environment, right? So you try to make your testing easier, design for testing. It will drive your design, and it's going to create an extremely stable system. I mean, look, I worked in a lot of different projects, right? I speak more from pragmatic experience than from, like, academic experience. So when you do things over and over again, right,

Starting point is 00:49:46 and it could work in production and it's easy to maintain, and I just told you the real story, right? I will not mention the company where it happened, but I told the real story. You think, like, you must be doing something right, and then why not share it with the world? Sure. So my point is that in my previous experience, yes, I did not write unit tests. I think, Jason, I want to ask you, when was the first time you wrote a unit test? I'm actually interested, like, when was the actual first time you wrote your first unit test? Well, so I will say ChaiScript, that project was the first one that I worked on that actually had tests. So I will say, you know, in my defense, but also...

Starting point is 00:50:26 No, I'm not putting you on the spot. But also to condemn myself, in a sense, I didn't even write the first test. My cousin who was working on the project with me is the one who wrote the first test. He's like, we need to have tests. I'm like, that sounds like a good idea. So go ahead and do that testing thing, right? Like, I wasn't used to this at all. Yes, true.

Starting point is 00:50:47 I heard, I think it was a panel between Stroustrup, Rob Pike, Alexandrescu, and someone mentioned that unit tests were created for untyped, interpreted types of languages, because C++ gives you static typing.

Starting point is 00:51:04 That's true, maybe, but I think unit tests are useful across the whole industry. And in fact, I think they do drive your design. Whatever you do, think, is it easy to test? If it's not easy to test, red flag.

Starting point is 00:51:20 Your component is not properly designed. That's simple, right? Like, you don't have to read, like, 2,000 pages to understand that. You write a class. Is it hard to test? Yes. Wrong design.

Starting point is 00:51:32 Easy to test? Yes. Probably did a good job. Do you mind if I quote you in the testing section on my C++ book? Of course. I will send you those principles, you know, the three things.

Starting point is 00:51:43 Absolutely. Okay. Go ahead, Jason. I was just going to say, it's funny, because through the whole course of this, I've got in the back of my mind some code that I wrote yesterday that I'm like, this code is being used everywhere, so I know that it's tested, and it is a strongly typed interface, but I should probably go ahead and write a unit test for it. I wasn't thinking about it for that particular function. So, yeah, I'm going to do that when we get off the call.

Starting point is 00:52:07 The rule is like that. Every public function or every free function, that's easy, right? Every public function at least should have a test probably with the same name, at least one, maybe more, depending on the corner cases and the complexities, right? I feel like I'm seeing, you know, kind of like a complete picture from what I'm hearing from you, because like you said at the beginning, that people get caught up in these different terminologies. And it sounds to me like you're in a way saying test at every level as much as you can. Exactly. But be pragmatic, but never be a perfectionist because you'll never get anything done. Right?

Starting point is 00:52:42 Yeah. Oh, by the way, I will give you a good example. In my code, probably 60% of my code is tests. Yes, yes, yes. Most of the code that I write is tests.

Starting point is 00:52:58 Is that like 100% code coverage then? Well, I hope so, but you know, sometimes when you use those tools that give you the coverage reports, just because you have coverage doesn't mean you test every code within your function. But yes, it is close to that. You will never be perfect. It's impossible.

Starting point is 00:53:18 If you're a perfectionist, you will never deliver in all the time you'll ever get, but you have to be pragmatic, and you have to be disciplined, right? That's the point. So if you want to look at how much testing code you should have, probably 50% of your code should be tests. That's my guide. Someone will say it should be even more. I will not argue, probably agree, but I'm at about 50% to 60%.

Starting point is 00:53:40 So looping back around to where we started with that poll, about 34% of developers not using unit tests. If some of our listeners are among those and have been convinced they need to do some unit testing, do you have any things to point them to, to learn a little bit more, to get started? Any

Starting point is 00:53:59 resources you might suggest? Any resources? There are plenty of books, but I think the best resources would be maybe finding open source projects and just good quality open source projects and take a look at how they're

Starting point is 00:54:15 tested. Also, YouTube videos, a bunch of YouTube videos. There's just plenty of stuff. For me, it was the real-life experience, right? There is one particular book that I would recommend. It's called Clean Code by Uncle Bob, I think. It's been around for a long time.

Starting point is 00:54:34 Right. Interestingly enough, when I read it, I did not read the section about the unit tests. So I read it a long time ago. And then I'm like, I read it, it didn't talk about this. When I looked back: oh wow, so he's advocating writing your tests before writing code, which is fine. If you could do that, do that. But the main thing is test your code. Make your tests drive your design if you have difficulty thinking how you can design it. By the way, software

Starting point is 00:55:02 design is extremely difficult, right? To achieve good quality code: small functions, simple. No cycles, and tests. If you do that, 80% you're there. The rest are details. I just want to, if you don't mind, mention again that you

Starting point is 00:55:19 gave some techniques for testing error cases because in my experience now as a more mature developer compared to the older projects that I started on, it seems to me that the error handling routines are never tested. Like when I look at code coverage reports on some of the projects I'm involved in,

Starting point is 00:55:36 I even submitted a test at one point on one project that had an error in it intentionally because I needed to test the error handling routine, and that pull request was denied. We don't want code with errors in it in our unit test framework. And I'm like, but we need to test that. Yeah, you need to test that, absolutely. Whether you return the error or you throw an exception,

Starting point is 00:55:56 the piece of code that, like, every code path should be executed during your unit tests. That's advisable. And if it's difficult to generate the errors, right, that's when mocking comes in handy, right? Right. Sometimes you can generate errors simply from the input.
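
Generating an error from the input alone, with no mock involved, might look like the following sketch. `parse_port` and `rejects` are hypothetical helpers invented for the example; the only library call relied on is `std::stoi`, which throws `std::invalid_argument` or `std::out_of_range` on bad input.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical function under test: parses a TCP port number and throws
// std::invalid_argument for anything that is not a whole number in 1..65535.
int parse_port(const std::string& text) {
    std::size_t consumed = 0;
    int value = 0;
    try {
        value = std::stoi(text, &consumed);
    } catch (const std::exception&) {
        // std::stoi signals non-numeric or overflowing input by throwing.
        throw std::invalid_argument("port is not a number: " + text);
    }
    if (consumed != text.size() || value < 1 || value > 65535) {
        throw std::invalid_argument("malformed or out-of-range port: " + text);
    }
    return value;
}

// Test helper: true if parse_port rejects the input with the expected type.
bool rejects(const std::string& text) {
    try {
        parse_port(text);
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}

// Both the success path and every error path are driven purely by input.
void test_parse_port() {
    assert(parse_port("8080") == 8080);
    assert(rejects("abc"));    // not a number
    assert(rejects("70000"));  // out of range
    assert(rejects("80x"));    // trailing garbage
}
```

A test like this deliberately feeds in invalid data; the "error" lives in the test input, not in the test code itself.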

Starting point is 00:56:13 You provide invalid input, it's, let's say, it's not undefined behavior, but there is a valid error code or exception thrown. By the way, I'm a big fan of exceptions. I hate error codes. I have no problem with exceptions. Yes. Remember, like, I remember in 2016, Eric Niebler, right,

Starting point is 00:56:30 who spoke about ranges at the time, and they're like, but you have to use exceptions. And he said, like, but the amount of time, did you actually profile? The amount of time you have your if statement, which could potentially throw your compiler off or the CPU executor off, right? Probably if you do exceptions, it's not going to

Starting point is 00:56:54 cause any kind of performance problems. You should probably measure your code before you decide whether you use exceptions or error codes. Yeah. Well, Oleg, it's been great having you on the show today. Where can listeners find you online? Well, I'm not really social as far as Facebook and Twitter.

Starting point is 00:57:18 I have a LinkedIn profile, and my email, you can post it into the notes. I have no problem with people reaching out directly. And I feel very passionate about what I do. If you find, let's say, flaws in my philosophy, please reach out. Prove it to me. Show me how it could be done better, right? This is what I always, this is my pushback. Show me. But I do have multiple successful projects behind me that are stable, that are working, that are in production, you know, and that are maintainable. And Jason, if you want to mention some of the stuff,

Starting point is 00:57:51 I can write you an email with those principles. Just like, if a component is hard to test, it's not properly designed. I think that's specifically what I want to quote, actually, that right there. So I will do that. All right, awesome. Thank you.

Starting point is 00:58:04 Thank you, guys. Thank you. Thanks so much for listening in as we chat about C++. We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, we'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter. Thank you. And of course, you can find all that info and the show notes on the podcast website at cppcast.com.

Starting point is 00:58:47 Theme music for this episode is provided by podcastthemes.com.
